Signal processing for speech synthesis

Before moving on to parametric speech synthesis, we need to learn more about signal processing. In particular, how can we represent speech as a set of parameters that are suitable for statistical modelling?

F0 estimation
A key parameter in any parametric representation of speech is the fundamental frequency, F0. Estimating it from speech is not trivial: we need an F0 estimation algorithm, often called a "pitch tracker".
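To make the idea concrete, here is a minimal sketch of one classic approach, picking the peak of the autocorrelation function within a plausible range of pitch periods. It is an illustration under stated assumptions, not the algorithm used by any particular pitch tracker: the function name `estimate_f0`, the search range of 60 to 400 Hz, and the voicing threshold of 0.3 are all hypothetical choices made for this example.

```python
def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0,
                voicing_threshold=0.3):
    """Estimate F0 in Hz for one frame of samples, or return None if unvoiced.

    All default values are illustrative assumptions, not tuned settings.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove any DC offset
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return None                        # silent frame: no F0 to report

    # Candidate pitch periods, in samples, for the assumed F0 range.
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), n - 1)

    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag, normalised by the frame energy.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_value:
            best_lag, best_value = lag, r

    # A weak peak suggests the frame is not periodic, so treat it as unvoiced.
    if best_lag is None or best_value < voicing_threshold:
        return None
    return sample_rate / best_lag
```

Even this simple version shows why the problem is not trivial: the result depends on the frame length, the search range, and the voicing decision, and real speech adds noise and octave errors (choosing half or double the true F0) on top.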
Vocoding
To model speech statistically, we need a parametric representation of it. This can be obtained with a source-filter model, or with something more general.
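As a rough illustration of the source-filter idea, the sketch below generates an impulse train at a given F0 (the source) and passes it through a single two-pole resonator standing in for one formant (the filter). The function names `impulse_train` and `resonator`, and the F0, formant frequency, and bandwidth values, are assumptions chosen for this example; a real vocoder uses a much richer filter and excitation.

```python
import math

def impulse_train(f0, sample_rate, num_samples):
    """Source: one unit pulse per pitch period (period rounded to whole samples)."""
    period = int(round(sample_rate / f0))
    return [1.0 if i % period == 0 else 0.0 for i in range(num_samples)]

def resonator(excitation, freq, bandwidth, sample_rate):
    """Filter: a two-pole resonator centred on `freq` Hz, standing in for one formant."""
    r = math.exp(-math.pi * bandwidth / sample_rate)   # pole radius, below 1 so the filter is stable
    theta = 2.0 * math.pi * freq / sample_rate         # pole angle
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in excitation:
        y = x + a1 * y1 + a2 * y2                      # y[n] = x[n] + a1*y[n-1] + a2*y[n-2]
        out.append(y)
        y2, y1 = y1, y
    return out

# Illustrative parameter values only: F0 of 100 Hz, one formant at 500 Hz.
source = impulse_train(f0=100.0, sample_rate=16000, num_samples=1600)
speech_like = resonator(source, freq=500.0, bandwidth=100.0, sample_rate=16000)
```

The point of the sketch is that the whole waveform is described by a handful of numbers (here F0, one formant frequency, and one bandwidth), which is what makes such a representation attractive for statistical modelling.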
