Low Pass Filter for isolating F0 in Epoch Detection
- This topic has 1 reply, 2 voices, and was last updated 2 years, 9 months ago by Simon.
March 6, 2022 at 13:07 #15750
The video on Epoch Detection for Module 6 says we can use a low-pass filter to try to remove all frequencies except F0, and then find peak crossings. I’m particularly confused about which domain (time or frequency) each step of epoch detection happens in, especially the low-pass filter step. Is the following high-level process correct?
1) first we take our waveform from the time domain into the frequency domain with the DFT
2) next we apply a low-pass filter in the frequency domain (?) to isolate the low frequencies – and this is as simple as multiplying the signal in the frequency domain by another wave
3) we convert the simplified signal back from the frequency domain to the time domain (using the inverse DFT?)
4) take the derivative of the simplified waveform in the time domain and find zero crossings
Also, why does the need for postprocessing (correcting the time offset) arise in the first place? Is this related to our simplification of the signal with the low-pass filter?
-
March 6, 2022 at 14:21 #15751
The filtering can be done directly in the time domain. It’s easiest to describe and understand filtering in the frequency domain, but the filter can be implemented as a direct operation on the speech waveform samples.
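To make the link between the two views concrete, here is a minimal sketch (not taken from the course materials) showing that multiplying spectra in the frequency domain gives the same result as convolving the waveform with the filter's impulse response in the time domain. The random test signal and the short smoothing filter `h` are arbitrary choices for illustration only.

```python
import numpy as np

def filter_time_domain(x, h):
    """Filter x by convolving it directly with the impulse response h."""
    return np.convolve(x, h)  # full convolution, length len(x) + len(h) - 1

def filter_frequency_domain(x, h):
    """Same filter: multiply the DFTs of x and h, then take the inverse DFT."""
    n = len(x) + len(h) - 1      # zero-pad so circular convolution equals linear
    X = np.fft.rfft(x, n)
    H = np.fft.rfft(h, n)        # frequency response of the filter
    return np.fft.irfft(X * H, n)

rng = np.random.default_rng(0)
x = rng.standard_normal(400)     # stand-in for a frame of speech samples
h = np.ones(9) / 9               # short smoothing filter, chosen arbitrarily

y_time = filter_time_domain(x, h)
y_freq = filter_frequency_domain(x, h)
print(np.allclose(y_time, y_freq))
```

So the "other wave" you multiply by in the frequency domain is the filter's frequency response, and the time-domain route simply never computes a DFT.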
Filter design is an entire subject on its own and out of scope. But we can understand one very simple form of low-pass filter that is easy to implement: a moving average. If we take a moving average of a speech waveform, that will smooth out the smaller details (i.e., remove the higher frequencies).
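A rough sketch of that idea, assuming a synthetic signal in place of real speech: a 100 Hz sinusoid stands in for F0 and a 2 kHz sinusoid stands in for the finer detail. The sampling rate, the window width of 41 samples and the helper names are all assumptions made for this example, not values from the course.

```python
import numpy as np

def moving_average(x, width):
    """Smooth x with a centred moving average of `width` samples (odd width)."""
    kernel = np.ones(width) / width
    # mode="same" keeps the output aligned with x; the first and last few
    # samples are averaged against implicit zeros, so the edges are less reliable
    return np.convolve(x, kernel, mode="same")

fs = 16000                                   # assumed sampling rate in Hz
t = np.arange(800) / fs                      # 50 ms of signal
low = np.sin(2 * np.pi * 100 * t)            # 100 Hz component, standing in for F0
high = 0.5 * np.sin(2 * np.pi * 2000 * t)    # 2 kHz component, standing in for detail
smoothed = moving_average(low + high, width=41)

def amplitude_at(signal, freq):
    """Amplitude of the DFT bin nearest to freq (in Hz)."""
    spectrum = np.abs(np.fft.rfft(signal)) * 2 / len(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq))]

print("100 Hz:", amplitude_at(low + high, 100), "->", amplitude_at(smoothed, 100))
print("2 kHz: ", amplitude_at(low + high, 2000), "->", amplitude_at(smoothed, 2000))
```

The 100 Hz component passes through almost unchanged while the 2 kHz component is strongly attenuated. A wider window smooths more aggressively, at the cost of also attenuating lower frequencies.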
The time offset arises because the main peak in the speech waveform may not align with the peak that we found in the low-pass-filtered version. There are two separate reasons for that. The first concerns the phases of the many different harmonic frequencies making up the speech signal. The second is the phase response of the low-pass filter (e.g., the filter introduces a time delay).
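The filter-delay part can be illustrated with a small sketch. It assumes a causal moving average (one that only uses the current and past samples) and a single smooth bump in place of a pitch period. The `refine_peak` helper is a hypothetical post-processing step written for this example, not the specific method from the video: it subtracts the known filter delay and then searches the original waveform nearby for the true peak.

```python
import numpy as np

def causal_moving_average(x, width):
    """y[n] = mean of x[n-width+1 .. n]; uses only current and past samples."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel)[:len(x)]

def refine_peak(x, coarse, search=5):
    """Move a peak estimate to the largest sample of x within +/- search samples."""
    lo = max(coarse - search, 0)
    hi = min(coarse + search + 1, len(x))
    return lo + int(np.argmax(x[lo:hi]))

n = np.arange(400)
x = np.exp(-0.5 * ((n - 200) / 10.0) ** 2)   # symmetric bump peaking at sample 200
width = 21
y = causal_moving_average(x, width)

delay = (width - 1) // 2                     # delay of this filter, in samples
peak_in_filtered = int(np.argmax(y))         # lands `delay` samples late
corrected = refine_peak(x, peak_in_filtered - delay)
print(int(np.argmax(x)), peak_in_filtered, corrected)
```

Subtracting a fixed delay only handles the filter's contribution. The local search in the original waveform is what would absorb the remaining offset caused by the phases of the harmonics in real speech.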