Signal processing for speech synthesis

Before moving on to parametric speech synthesis, we need to learn more about signal processing. In particular, how can we represent speech as a set of parameters that are suitable for statistical modelling?

  • F0 estimation

    A key parameter in any parametric representation of speech is the fundamental frequency, F0. Estimating it from speech is not trivial: we need an F0 estimation algorithm, often called a "pitch tracker".

  • Vocoding

    To model speech statistically, we need a parametric representation of it. One option is a source-filter model; another is a more general representation that does not assume a particular model of speech production.
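
To make the two topics above concrete, here is a minimal sketch in Python (using NumPy) that is not taken from the course materials. It builds a toy source-filter signal, an impulse train passed through a single two-pole resonator, and then estimates F0 from it by picking the strongest autocorrelation peak. The function names `synthesize_vowel` and `estimate_f0`, the single 700 Hz "formant", and the 60 to 400 Hz search range are all illustrative assumptions; practical pitch trackers and vocoders are considerably more elaborate.

```python
import numpy as np


def synthesize_vowel(f0_hz, sample_rate=16000, num_samples=800,
                     formant_hz=700.0, bandwidth_hz=100.0):
    """Toy source-filter synthesis (illustrative, hypothetical helper).

    Source: an impulse train with one pulse per pitch period.
    Filter: a single two-pole resonator standing in for the vocal tract.
    """
    period = int(round(sample_rate / f0_hz))
    source = np.zeros(num_samples)
    source[::period] = 1.0

    # Resonator coefficients from the assumed centre frequency and bandwidth.
    r = np.exp(-np.pi * bandwidth_hz / sample_rate)
    a1 = -2.0 * r * np.cos(2.0 * np.pi * formant_hz / sample_rate)
    a2 = r * r

    # Direct-form recursion: y[t] = x[t] - a1*y[t-1] - a2*y[t-2].
    out = np.zeros(num_samples)
    for t in range(num_samples):
        out[t] = source[t]
        if t >= 1:
            out[t] -= a1 * out[t - 1]
        if t >= 2:
            out[t] -= a2 * out[t - 2]
    return out


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 of one voiced frame from its autocorrelation peak.

    Assumes the frame is voiced and spans at least two pitch periods;
    there is no voicing decision and no smoothing across frames.
    """
    frame = frame - np.mean(frame)
    # Keep only the non-negative lags of the autocorrelation.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]

    # Restrict the search to lags that correspond to plausible F0 values.
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(ac) - 1)
    best_lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sample_rate / best_lag


if __name__ == "__main__":
    signal = synthesize_vowel(125.0, sample_rate=16000)
    print(estimate_f0(signal, 16000))
```

On this clean synthetic signal the estimate should land close to the 125 Hz used for synthesis. Real speech is harder: the resonances of the filter, noise, and unvoiced regions all produce competing autocorrelation peaks, which is one reason F0 estimation is not trivial.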