Tuesday, August 23, 2022

DFT_grid.m

Pitch detection by DFT grid
 
Acknowledgements

1. In the late 80's and early 90's, I studied Byzantine music in the byz music school of Piraeus (Greece). There, I learned the basics of byz music solfège and I decided to create a project to help me learn to sing.
2. In 90's and in 2004 I was studing DSP books in Eugenidis Foundation Library. There, I learned DSP basics.
3. In 2016 I studied Mr. Richard Lyons, a mentor for DSP specialists, book: Understanding Digital Signal Processing, and learned DFT.
4. In 2017 I asked at dsp.stackexchange question: Can I test any frequency with DFT? Marcus Müller, also mentor for DSP specialists, answered *yes*. So I learned can perform DFT grid.
5. Mr. Jeffrey Clark, at matlab suggested me 16384 for second argument in function fft.
 
Introduction

1. I'm homeless and can't implement this code in DSP kit. I offer it for free. It detects fundamental frequency (fundFreq), or pitch, or 1st harmonic, using DFT grid about fundFreq detected by FFT or any other method detecting pitch. My method is the condition user give pitch between one tone (9/8) lower than note's frequency and one tone higher than it, even a student of music can do. Then program, by matlab's abs(fft()), find indices up to 1.2 note's frequency. 1st harmonic's abs is the maximum.
2. Indeed it should be embedded in a project executing, in part, in DSP kit.
3. Project should have two parts. One, executing in DSP kit
performs DFT grid, and other, executing in PC's processor, performs the graphics.
4. Graphics part represents musical note as rectangle with height representing note's interval from base by some unit e.g. cent of semitone. Wide represents note's duration (temp). Color represents sound's amplitude, volume. Rectangle moves to left with speed related with wide. At the left side of window a pointer moves up and down. Its point's height from base, represents voice's pitch. For get an idea see Metrw-pitch graphics. 
5. You can create vertical scale with zero at its middle, upper half representing one semitone, and lower also one semitone, just a micrometer. Zero represents the note rectangle's upper edge. A speed regulator is a must, as well a horizontal scroll bar for user can go back.
6. To measure the volume of the sound, we need to add a snippet. Also we need a third scale for sound's volume.
 
Accuracy error
 
1. Singers, chanters, violists have to give even the smallest musical interval, diesis. But fundFreq, detected by FFT, has accuracy error about a diesis, so projects detecting fundFreq by FFT do not satisfy musicians. Similar errors occur in other methods detecting pitch. Present code detects funbdFreq with accuracy better than 0.1 diesis. So when this code is embedded in a project executing in DSP kit, user can see in real time 8 times per second his voice's pitch error and regulate voice's pitch for give diesis. Question is whether MATLAB's code executing in dsp kit, retains same accuracy as in _MATLAB online R2020b. See answer from Mr. Ben to my question at dsp.stackexchange.
2. Error depends on the ratio Segm/SampFreq. As greater the ratio as higher the accuracy. But as lower the pitch's iteration rate in graphics.
3. Error depends also on fundFreq. As lower fundFreq as greater accuracy error. Reason for both is obvious: As more FundFreq's periods in Segm, as better accuracy. Moreover it depends, to a lesser degree, on amplitude, phase, and upper harmonics.
 
General instructions

1. For avoid analog filter which distorts signal, I suggest sampling frequency 96000 samples/sec. Some noise reduction snippet is necessary. After that a
2. Low pass FIR filter with transition band 7000-8000 khz.
3. Choosing 1 sample every 6, SampFreq is reduced to16000 samples/sec.
4. Programmer can change numbers.
5. For a conical presentation of FFT amplitudes see answers to my question at dsp.stackexchange.
 
 
The code    

format default;
SampFreq = 16000;
Segm = 1:2048;
baseFreq = 256;
noteFreq = baseFreq*100/81; % up to 2666, for 3rd harmonic < 8000
x = 1.1; % 8/9 < x < 9/8
voicePitch = x * noteFreq;
% Generating signal
FirstHarmAngles = voicePitch*2*pi/SampFreq*Segm+1.9*pi;
SinFirstHarmAngles = sin(FirstHarmAngles);
SecondHarmAngles = voicePitch*2*2*pi/SampFreq*Segm+2.9*pi;
SinSecondHarmAngles = sin(SecondHarmAngles);
ThirdHarmAngles = voicePitch*3*2*pi/SampFreq*Segm+0.3*pi;
SinThirdHarmAngles = sin(ThirdHarmAngles);
Xn=170000*SinFirstHarmAngles+220000*...
SinSecondHarmAngles+150000*...
SinThirdHarmAngles;
% Finding FFT pitch
magn = abs(fft(Xn, 16384));
OnePointTwo = round(noteFreq * 1.2);
[maxMagn, FFTpitch] = max(magn(1:OnePointTwo));
% Performing DFT grid
LowSpanLimit = 0.8*FFTpitch;
HighSpanLimit = 1.2*FFTpitch;
PartsOfSpan = 1001;
TestFreqs = logspace(log10(LowSpanLimit), log10(HighSpanLimit), PartsOfSpan);
MagnSqrd = ones(1, PartsOfSpan);
for f = 1:PartsOfSpan
Angles = TestFreqs(f)*2*pi/SampFreq*Segm;
XnCos = sum(Xn.*cos(Angles));
XnSin = sum(Xn.*-sin(Angles));
MagnSqrd(f) = XnCos.^2+XnSin.^2;
end
% Finding GRID pitch
[maxMagnSqrd, maxMagnSqrdIndex] = max(MagnSqrd);
GRIDpitch = TestFreqs(maxMagnSqrdIndex);
voiceRatioInterval = GRIDpitch/baseFreq;
% Following should be incorporated in graphics part.
mainUnit = 9/8; % for cents of semitone it is 2
tempUnit = 100; % for cents of semitone it is 1200
noteRatioInterval = noteFreq / baseFreq;
tempNOTEpitch = log(noteRatioInterval)/log(mainUnit)*tempUnit;
tempGRIDpitch = log(voiceRatioInterval)/log(mainUnit)*tempUnit;
% Display results
noteFreq
voicePitch
FFTpitch
GRIDpitch
tempGRIDpitch
tempNOTEpitch

Tuesday, February 2, 2021

Metrw-pitch graphics

Indroduction
 
1. I'm homeless and can't implement this code in DSP kit. I offer it for free.
2. Project detects fundamental frequency (fundFreq), or pitch, performing DFT grid around fundFreq detected by FFT. Indeed it should be embedded in a project executing, in part, in DSP kit.
2. Project should have two parts. One, executing in DSP kit detects fundFreq (see DFT_grid.m) and other, executing in PC's processor, has the graphics.
3. Graphics part represents musical note as rectangular with height representing note's interval from base by some unit e.g. cent of semitone. Wide represents note's duration. Color represents sound's volume. At a specific speed, the rectangle (Quad) travels to the left.
4. At the left side of window a pointer moves up and down. Its point's height from base represents voice's pitch.
5. Another vertical scale with zero at its middle, upper half representing one semitone, and lower also one semitone, is micrometer. Zero represents the note rectangular's upper edge.
6. Third vertical scale - not shown in graphics - indicates the volume of the sound.
7. A speed regulator is a must, as well a horizontal scroll bar for the user can go back.
8. For get an idea follow this tutorial (it's not updated but you only need set up on your project GLFW only) or this (it's updated but you need set up GLFW and GLAD). Then run follow code.
9. I got assistance from Mr. Dougbinks in learning how to make 2D graphics using OpenGL.
______________________________________________
#include <GLFW/glfw3.h>
#include <gl/GL.h>
#include <iostream>

void render_loop(float a, float b, float x, int lineHeight2, int lineHeight1)
{
    glClearColor(0.2f, 0.2f, 1.0f, 0.1f);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    glBegin(GL_QUADS);

    glColor3f(0.0, 1.0, 0.0);
    glVertex2f(a + x, 0.0f);
    glVertex2f(b + x, 0.0f);
    glVertex2f(b + x, 700.0f);
    glVertex2f(a + x, 700.0f);

    glColor3f(1.0, 1.0, 1.0);
    glVertex2f(0, 0.0f);
    glVertex2f(200, 0.0f);
    glVertex2f(200, 900.0f);
    glVertex2f(0, 900.0f);

    glEnd();

    glBegin(GL_LINES);

    glColor3f(1.0, 0.0, 0.0);
    glVertex2i(100, lineHeight2);
    glVertex2i(200, lineHeight2);

   glColor3f(1.0, 0.0, 1.0);
    glVertex2i(0, lineHeight1);
    glVertex2i(100, lineHeight1);

    glColor3f(0.0, 0.0, 0.0);
    glVertex2i(100, 0);
    glVertex2i(100, 800);

    glColor3f(0.0, 0.0, 0.0);
    glVertex2i(0, 400);
    glVertex2i(100, 400);

    glEnd();

}

/* program entry */
int main(int argc, char* argv[])
{
    GLFWwindow* window;

    if (!glfwInit())
    {
        std::cout << "Failed to initialize GLFW\n";
        exit(EXIT_FAILURE);
    }

    glfwWindowHint(GLFW_RESIZABLE, GL_FALSE);
    window = glfwCreateWindow(1800, 800, "LearnSinging", NULL, NULL);
    glfwSetWindowPos(window, 10, 40);

    if (!window)
    {
        std::cout << "Failed to open GLFW window\n";
        glfwTerminate();
        exit(EXIT_FAILURE);
    }


    glfwMakeContextCurrent(window);
    glfwSwapInterval(1);

    // set up view
    glViewport(0, 0, 1800, 800);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();

    // see https://www.opengl.org/sdk/docs/man2/xhtml/glOrtho.xml
    glOrtho(0.0, 800.0, 0.0, 800.0, 0.0, 1.0); // this creates a canvas
    // you can do 2D drawing
    // Main loop
    float x = 0.0f;
    /*If screen's frequency is 60 hz, that is 60 iterations/second, then
    speedFactor = 1 means
     speed = 1 pixel / iteration, or 60 pixels/second.
    In general speedFactor = speed/screen's frequency.
    For example if you prefer whole note = 1 sec representing by 90 pixels,
    it means speed = 90 pixels/sec, which needs speedFactor = 1.5.*/

    float speedFactor = 1.3f;
    float a{ 200.0f };
    float b{ 750.0f };
    int lineHeight2{ 500 };
    int lineHeight1{ 500 };
    int lastTime{ 0 };

    while (!glfwWindowShouldClose(window))
    {
        if (glfwGetTime() >= lastTime)
        {
            lineHeight2 = rand() % 100 + 650;
            lineHeight1 = ((lineHeight2 - 700) * 5) + 400;
            lastTime = lastTime + 1;
        }

        x = x - speedFactor;

        // Draw gears
        render_loop(a, b, x, lineHeight2, lineHeight1);
        // Swap buffers
        glfwSwapBuffers(window);
        glfwPollEvents();
    }
    // Terminate GLFW
    glfwTerminate();

    // Exit program
    exit(EXIT_SUCCESS);
}

Wednesday, May 13, 2020

General formulas for the equal-tempered interval

It is copied from wikipedia's article "Equal temperament", now in "View history" section.

Background theory

Basic concepts:
From logarithms: The logarithm of a number is the exponent to which another fixed number, the base, must be raised to produce that number. For example: (98)2 = 8164. The logarithm of 8164 to base 98 is 2, or
To compare logarithms, we use some standard base, e.g. 10 or e. Usually, in mathematical notation "log" means "logarithm with base 10". For example: log 98 = log 1.125 = 0.0511525224473813, and, log (98)2 = log 8164 = log 1.265625 = 0.1023050448947626. The latter logarithm divided by the former yields 2, the exponent as above.
From physics: For a given velocity of the wave on the string, length and frequency of the sound produced from a string when it is plucked, are inverse proportional. For the same tension in the string, and the same string’s linear density, the velocity of the wave on the string is the same.
From acoustics: Pitch is the distinctive quality of a sound, dependent primarily on the [fundamental] frequency of the sound waves produced by its source.[45]
Psychoacoustical law: It was known to Pythagoras[46][47] and perhaps to ancient Chinese philosophers, that our subjective sensation of the difference (interval) in pitch (tone) is inversely proportional to the exponent of the ratio of string (chord) lengths. In modern terms, it is proportional to the logarithm of the ratio of fundamental frequencies of sounds produced by two different lengths of the string.
Example
Consider two pairs of a string's lengths: 90 cm, 80 cm, and, 81 cm, 64 cm. Our subjective sensation of pitch difference (interval) between the sounds produced by string's lengths 81 cm and 64 cm, the ratio 8164 = (98)2, is double that between 90 cm and 80 cm, the ratio 9080 = 98 (see above, From logarithms and Psychoacoustical law). However the pitch of sound from the 90 cm is lower than that from the 80 cm, and from the 81 cm is lower than that from the 64 cm. Now consider two pairs of fundamental frequencies (from other lengths): 90 Hz/80 Hz and 81 Hz/64 Hz. Our subjective sensation of the ratio 8164 = (98)2 is also twice that of 9080 = 98, but the pitch of 90 Hz is higher than that of 80 Hz, and that of 81 Hz higher than that of 64 Hz. The greater the string's length, the lesser the fundamental frequency, and the lower the pitch.
Let A and B be two periodic sounds with stable fundamental frequencies (stimuli) ffreqA and ffreqB. Then, the ratio ffreqAffreqB is called the ratio interval. Our subjective sensation of the interval between these sounds is proportional to log ffreqAffreqB.
Let C and D be two other periodic sounds with stable fundamental frequencies: ffreqC and ffreqD. Their ratio interval is ffreqCffreqD. Let
(see above, logarithms, second example, and below, main formula for d = 1). Then
Our subjective sensation of the sound interval A/B (represented by log ffreqAffreqB), is proportional, by proportion m, to our subjective sensation (represented by log ffreqCffreqD) of the sound interval C/D. The proportion m is magnitude, and log ffreqCffreqD the unit for the tempered interval A/B.
Indeed, if, in the last equation, we multiply m by d and divide log ffreqCffreqD by d, that is
md
then md is the magnitude and
is the unit, as with every quantity, e.g. length L = 5 m. If the magnitude 5 is multiplied by 100, and m (certain length, serving as unit) divided by 100 (that is, cm), then L = 500 cm. When ffreqCffreqD = 2 and d = 1200, unit is cent (one 1200th of octave)."
Let t(r) be the tempered interval of ratio interval r. Then we have:
where m is magnitude and log bd is the unit of tempered interval. The main formula below is obtained from this one.
In the example 2 of the main formula, it is
which means our subjective sensation of the ratio interval 32 (a perfect fifth) is 3.442 times our subjective sensation of the ratio interval 98 (a major second). The number 3.442 is the magnitude, and the tempered interval t(98) = log 98 is the unit of tempered interval t(32) = log 32. Multiplying by 100 yields 344.2 hundredths of the tempered interval t(98).
Note: Pitch depends to a lesser degree on the sound pressure level (loudness, volume) of the tone, especially at frequencies below 1,000 Hz and above 2,000 Hz. The pitch of lower tones gets lower as sound pressure increases. For instance, a tone of 200 Hz that is very loud seems one semitone lower in pitch than if it is just barely audible. Above 2,000 Hz, the pitch gets higher as the sound gets louder. See Pitch and frequency (last paragraph, "
Pitch depends to a lesser degree...").

Main formula

where m is the magnitude of tempered interval, r the ratio interval, b the base interval (2 for octave) and d the divisor of base interval (1200 for cents).
Here are some examples of the formula:
  1. Ratio interval r = 32, and unit cent (b = 2 and d = 1200), then m = 701.955... and t(32) = 701.955 cents.
  2. Ratio interval r = 32, and unit hundredth of the whole tone 98 (b = 98 and d = 100), then m = 344.247... and t(32) = 344.247 hundredths of the whole tone.
  3. Ratio interval r = 2 (octave) and unit hundredth of the whole tone 98 (b = 98 and d = 100), then m = 588.4949... and t(2) = 588.4949 hundredths of the whole tone.
  4. Ratio interval r = 2 (octave), and unit hundredth of Pythagorean limma (b = 256243 and d = 100), then m = 1330 and t(2) = 1330 hundredths of Pythagorean limma.
  5. Ratio interval r = 800729 (idiosyncratic of Byzantine music), and unit tenth of Pythagorean comma (b = 531441524288 and d = 10), then m = 68.58... and t(800729) = 68.58 tenths of Pythagorean comma.
  6. Ratio interval r = 256243 (Pythagorean limma), and unit hundredth of the fourth (b = 43 and d = 100), then m = 18.1158... and t(r) = 18.1158 hundredths of the fourth.
  7. Ratio interval r = 98 (whole tone), and unit hundredth of the fourth (b = 43 and d = 100), then m = 40.942... and t(98) = 40.942 hundredths of the fourth.
  8. Ratio interval r = 1.359 (1 plus random decimal 359), and unit cent (b = 2 and d = 1200), then m = 531.05... and t(1.359) = 531.05 cents.
  9. Ratio interval r = 0.052 (random decimal part), and unit cent (b = 2 and d = 1200), then m = -5118 and t(0.052) = −5118 cents.
  10. Ratio interval r = 1, and unit cent (b = 2 and d = 1200), then m = 0 and t(1) = 0 cents.
  11. Ratio interval r = 2 (octave), and unit thousandth of tempered 10 (b = 10 and d = 1000), then m = 301.03 and t(2) = 301.03 1000ths of t(10).
  12. With natural logarithm (ln): Ratio interval r = 2 (octave) and unit thousandth of tempered e (b = e = 2.71828... and d = 1000), then m = 693 1000ths of t(e) with natural logarithm.
Remarks:
1. For 11th example: log(10) = 1, so log(b) (the unit) becomes 1 and is ignored. So we could call it "absolute" or "abstract" unit with log.
2. For 12th example: ln(e) = 1, so ln(b) (the unit) becomes 1 and is ignored. So we could call it "absolute" or "abstract" unit with ln.

Derived formula

where r, b, d, m are as in the main formula, above. This can be obtained by taking the exponent of each side in the main formula.
  1. When b = 2 (octave), d = 1200, m = 100 (tempered semitone), then r = 1.0594630943592953, ratio interval of tempered semitone.
  2. When b = 9 8 (whole tone), d = 100, m = 344.247..., then r = 1.5 = 32 (perfect fifth). Compare example 2 of main formula.
  3. When b = 256243, d = 100, m = 226, then r = 1.125 = 98 (whole tone).
  4. When b = 256243, d = 100, m = 1330, then r = 2 (octave). Compare example 4 of main formula.
  5. When b = 256243, d = 100, m = 655 (random number), then r = 1.406859...
  6. When t(r) = −351 (random) cents (b = 2, d = 1200), then r = 0.816485969595...
  7. Let note NH = 250 Hz, and we need some note's frequency NH plus 98, plus 800729, minus (descending diesis) 13 of t(256243) (tempered Pythagorean lemma). Calculation is following:
Desired frequency = 250 Hz × 98 × 800729 × (256243)13 = 250 Hz × 1.125 × 1.097393681 × 0.982778 = 303.3 Hz.
8. When b = 10, d = 1000, m = 301.03, then r = 2 (octave).

Notes

  • The numbers r, m, b, d could be random under following conditions:
    1. The magnitude (m) can be any real number.
    2. The ratio interval (r) must be a positive number: r > 0.
    3. The base interval (b) must be greater than unity: b > 1.
    4. The divisor (d) must be a natural number: d = 1, 2, 3, ...
  • The relations between r and m are:
    • When 0 < r < 1, then m < 0.
    • When r = 1, then m = 0.
    • When r > 1, then m > 0.
as with logarithms in general.