Window length and spectral resolution
- This topic has 3 replies, 3 voices, and was last updated 3 months, 2 weeks ago by Eli J.
October 2, 2024 at 22:54 #17952
If I’m understanding correctly, the spectral resolution for Fourier analysis is given by 1/[window size]. However, when I use Praat’s default settings (0.005 s window size, which would give a 200 Hz spectral resolution using this formula) and take a spectral slice at a point in time, the analysis frequencies appear to be about 86 Hz apart. What have I misunderstood here?
October 3, 2024 at 10:40 #17954
I think this is because when you select a window in Praat, it doesn’t ONLY select your window, but a bit before and after.
In the videos, it is mentioned that selecting windows with rectangular ‘edges’ creates sharp changes at the beginning and end of your selection, which are artefacts (not in the real wave).
So by default Praat will make your window ‘tapered’ and overlapping (in the shape of a Gaussian curve, as per the documentation mentioned in the lecture) to smooth out these artefacts.
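To illustrate the tapering idea, here is a minimal sketch that compares the first sample of a cosine under a rectangular window and under a Gaussian-shaped taper. The `gaussian_taper` function and its `sigma_fraction` parameter are hypothetical choices for illustration only; this is not Praat's exact window formula.

```python
import math

def gaussian_taper(num_samples, sigma_fraction=1 / 6):
    """Return a Gaussian-shaped taper of the given length.

    sigma_fraction is an assumed value (standard deviation as a fraction
    of the window length); it is NOT taken from Praat.
    """
    centre = (num_samples - 1) / 2
    sigma = sigma_fraction * num_samples
    return [math.exp(-0.5 * ((n - centre) / sigma) ** 2)
            for n in range(num_samples)]

sample_rate = 44100
num_samples = 441  # 10 ms of audio at 44100 Hz

# A 150 Hz cosine: 1.5 cycles fit in 10 ms, so a rectangular window
# cuts the wave off abruptly at full amplitude at the start.
signal = [math.cos(2 * math.pi * 150 * n / sample_rate)
          for n in range(num_samples)]

taper = gaussian_taper(num_samples)
tapered = [s * w for s, w in zip(signal, taper)]

print("first sample, rectangular window:", signal[0])
print("first sample, Gaussian taper:    ", tapered[0])
```

With the taper applied, the samples near the edges are scaled towards zero, so the abrupt jump at the window boundary is much smaller than with the rectangular window.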
October 3, 2024 at 17:03 #17958
Yes! The relevant documentation is here:
https://www.fon.hum.uva.nl/praat/manual/Intro_3_7__Configuring_the_spectral_slice.html
If you look in the default “Advanced Spectrogram Settings…” (in the Spectrogram drop down) you’ll see that the default setting for Window Shape is “Gaussian”. The documentation says that for a window length of 0.005 s, “If the window shape is Gaussian, Praat will extract a part of the sound that runs from 5 milliseconds before the cursor to 5 ms after the cursor. The spectrum will then be based on a “physical” window length of 10 ms, although the “effective” window length is still 5 ms”. This is because a Gaussian window shape basically reduces the amplitudes at the edges of the window. It’s a tapered window as discussed in this video: https://speech.zone/courses/speech-processing/module-3-digital-speech-signals/videos-2/short-term-analysis/.
So the number of samples in that window actually corresponds to the number of samples in 0.010 s given a sampling rate of 44100 Hz (the sampling rate of the recording). The length of the window in samples is then 44100 * 0.01 = 441 samples. But that still doesn’t quite get you the analysis frequencies you see in Praat for that specific example.
Praat actually does one more thing that changes the number of samples (and hence the real window length), which is that it uses the Fast Fourier Transform. This is an efficient implementation of the Discrete Fourier Transform that is much faster than the original formulation. The catch is that, in its standard form, it requires the number of input samples to be a power of 2. So in this case, the window is padded out with zeros to 512 samples (=2^9).
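The padding step can be sketched in a few lines of Python. The helper names `next_power_of_two` and `zero_pad` are made up for this illustration and say nothing about how Praat implements it internally; the dummy window of ones just stands in for real audio samples.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    power = 1
    while power < n:
        power *= 2
    return power

def zero_pad(samples, target_length):
    """Append zeros so that the result has target_length samples."""
    return list(samples) + [0.0] * (target_length - len(samples))

sample_rate = 44100       # Hz, sampling rate of the recording
physical_window = 0.010   # s, physical window length for the Gaussian window

num_samples = round(sample_rate * physical_window)
fft_length = next_power_of_two(num_samples)

window = [1.0] * num_samples          # placeholder samples, not real audio
padded = zero_pad(window, fft_length)

print(num_samples, "samples padded to", len(padded))
```

The original 441 samples are untouched; only zeros are appended, so the padding changes the number of analysis frequencies without adding any new signal content.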
This is an option if you create a spectrum from the objects menu. See the setting “fast” here (though I don’t recommend trying to understand the Fourier transform from the rest of that page!): https://www.fon.hum.uva.nl/praat/manual/Sound__To_Spectrum___.html
However, the use of the Fast Fourier Transform appears to be a fixed setting if you generate the spectral slice from the Sound viewer.
So, if you have 512 samples and a sampling rate of 44100, you get DFT analysis frequencies as multiples of 44100/512 = 86.13281 Hz (which is what Eli observed above!).
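As a quick check of that arithmetic, here is a small sketch that compares three candidate spacings: the naive 1/[window size], the spacing for the 441-sample physical window, and the spacing for 512 samples. The variable names are my own; the numbers are the ones from this thread.

```python
sample_rate = 44100        # Hz
effective_window = 0.005   # s, the window length set in Praat
physical_samples = 441     # samples in the 10 ms physical window
fft_length = 512           # samples after zero padding

# Spacing you would expect from 1 / [window size]
naive_spacing = 1 / effective_window

# Spacing if the DFT were taken over the 441-sample window
unpadded_spacing = sample_rate / physical_samples

# Spacing for a 512-point DFT
padded_spacing = sample_rate / fft_length

# Analysis frequencies from 0 Hz up to half the sampling rate
frequencies = [k * padded_spacing for k in range(fft_length // 2 + 1)]

print("naive spacing:   ", naive_spacing, "Hz")
print("unpadded spacing:", unpadded_spacing, "Hz")
print("padded spacing:  ", padded_spacing, "Hz")
print("first few analysis frequencies:", frequencies[:4])
```

Only the last spacing matches the roughly 86 Hz steps seen in the spectral slice.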
The moral of this story is that for software (like Praat) which has a lot of potential options “under the hood”, you may need to check the documentation to understand exactly what it’s doing!
October 4, 2024 at 19:55 #17964
That makes total sense, thank you both!