Filter Confusion
- This topic has 3 replies, 2 voices, and was last updated 3 years, 9 months ago by Simon.
October 21, 2020 at 11:11 #12636
I am getting really confused about the point at which we introduce the filter – if we already have pre-recorded diphones and can manipulate F0 and duration using TD-PSOLA, why is the filter still necessary? Is it because we can only manipulate resonances through the filter?
-
October 21, 2020 at 13:54 #12637
The source-filter model is a conceptual model that helps us understand speech signals by modelling how they are produced.
We can use this model in many different ways, to do many different things with speech signals. Sometimes we actually implement the model directly, other times we use it as motivation and use it indirectly.
A direct application would be to use the model itself to process speech. We would fit the filter to natural speech, thus finding suitable values for the coefficients of the difference equation defining the filter. Then, we might generate synthetic speech at an F0 and duration of our choice by constructing an appropriate impulse train as the excitation signal to input to the filter.
In this direct application, we manipulate source features (F0, duration) by changing the excitation signal. We can also independently manipulate the filter’s resonances (formant frequencies) by adjusting the coefficients of the difference equation. We have total control over the signal.
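As a rough illustration of the direct application, here is a minimal sketch in Python. It does not fit the filter to natural speech; instead it assumes a hand-picked two-pole resonator with a single formant at 500 Hz, and all function names and parameter values (`resonator_coeffs`, `impulse_train`, `apply_filter`, the 80 Hz bandwidth, the 16 kHz sampling rate) are made up for the example.

```python
import numpy as np

def resonator_coeffs(formant_hz, bandwidth_hz, fs):
    """Coefficients of the difference equation
    y[n] = x[n] - a1*y[n-1] - a2*y[n-2] for one resonance (assumed filter)."""
    r = np.exp(-np.pi * bandwidth_hz / fs)   # pole radius sets the bandwidth
    theta = 2 * np.pi * formant_hz / fs      # pole angle sets the formant frequency
    return -2 * r * np.cos(theta), r * r

def impulse_train(f0_hz, duration_s, fs):
    """Source: one impulse per pitch period, for the requested duration."""
    x = np.zeros(int(round(duration_s * fs)))
    x[::int(round(fs / f0_hz))] = 1.0
    return x

def apply_filter(x, a1, a2):
    """Filter: run the difference equation sample by sample."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = x[n]
        if n >= 1:
            y[n] -= a1 * y[n - 1]
        if n >= 2:
            y[n] -= a2 * y[n - 2]
    return y

fs = 16000
a1, a2 = resonator_coeffs(500.0, 80.0, fs)

# Same filter, two different sources: F0 and duration change, the resonance does not.
low = apply_filter(impulse_train(100.0, 0.5, fs), a1, a2)
high = apply_filter(impulse_train(200.0, 0.25, fs), a1, a2)

# Same source, different filter: only the resonance moves.
b1, b2 = resonator_coeffs(700.0, 80.0, fs)
shifted = apply_filter(impulse_train(100.0, 0.5, fs), b1, b2)
```

The point of the sketch is that source and filter are separate objects in the code, so either one can be changed without touching the other.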
But we can also use the model indirectly – that is, without ever writing down the difference equation or inputting an excitation signal into an actual filter. TD-PSOLA is an example of this way of using the source-filter model. Rather than defining the filter with a difference equation, we represent it by its impulse response (which is a waveform). TD-PSOLA only offers partial control over the signal: we can manipulate F0 and duration, but not the filter’s resonances, because the impulse response is a waveform and not a set of coefficients in an equation.
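The indirect use can be sketched in the same spirit. The code below is a simplified overlap-add, not a full TD-PSOLA implementation (there is no pitch marking or analysis of recorded speech). The stored pitch period `unit` is a synthetic stand-in, and `overlap_add` is a hypothetical helper name.

```python
import numpy as np

def overlap_add(unit, period_samples, n_units):
    """Place n_units copies of a stored waveform every period_samples
    samples and sum them wherever they overlap."""
    out = np.zeros(period_samples * (n_units - 1) + len(unit))
    for k in range(n_units):
        start = k * period_samples
        out[start:start + len(unit)] += unit
    return out

fs = 16000

# Assumed stand-in for one stored, windowed pitch period (the "impulse response"):
# a Hann-windowed 500 Hz sinusoid, 320 samples long.
n = np.arange(320)
unit = np.hanning(320) * np.sin(2 * np.pi * 500.0 * n / fs)

# Spacing between copies sets F0; the number of copies sets duration.
low = overlap_add(unit, 160, 50)    # copies every 160 samples: F0 = 100 Hz
high = overlap_add(unit, 80, 50)    # copies every 80 samples:  F0 = 200 Hz
```

Nothing in this code gives access to the resonance: `unit` is only ever copied and summed, never re-derived from coefficients, which is why this way of using the model offers no handle on formant frequencies.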
-
October 22, 2020 at 14:43 #12659
So the filter in TD-PSOLA is the impulse response of a diphone?
-
October 22, 2020 at 16:16 #12663
Yes, nearly. In TD-PSOLA, our representation of the filter is the impulse response. This is a single pitch period. A diphone is a sequence of pitch periods.
Why can’t we store a single pitch period to represent the filter for a complete diphone?