Includes spectral envelope extraction (cepstrum or LPC), source representation (the residual), pitch tracking and pitch marking.
Paul Taylor, “Text-to-Speech Synthesis”, 2009, Cambridge University Press, Cambridge, ISBN 0521899273
Taylor - Section 12.3 - The cepstrum
By using the logarithm to convert a multiplication into a sum, the cepstrum separates the source and filter components of speech.
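As a rough illustration of that idea (not taken from Taylor), the sketch below builds a toy "speech" frame by circularly convolving a made-up source with a made-up filter, then checks that the real cepstrum of the result equals the sum of the two individual cepstra. The signals, the frame length `N`, and all function names are assumptions chosen for the demo, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (fine for a short toy frame)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(x):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    n = len(x)
    log_mag = [math.log(abs(v)) for v in dft(x)]
    return [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * q / n) for k in range(n)).real / n
        for q in range(n)
    ]


def circular_convolve(a, b):
    """Circular convolution, so that the DFTs multiply exactly."""
    n = len(a)
    return [sum(a[m] * b[(t - m) % n] for m in range(n)) for t in range(n)]


N = 32  # assumed toy frame length

# Hypothetical "source": a pulse plus a weaker repeat 8 samples later.
source = [0.0] * N
source[0] = 1.0
source[8] = 0.5

# Hypothetical "filter": a smooth, decaying impulse response.
filt = [0.8 ** t for t in range(N)]

speech = circular_convolve(source, filt)

c_source = real_cepstrum(source)
c_filter = real_cepstrum(filt)
c_speech = real_cepstrum(speech)

# Convolution in time is multiplication in frequency; the log turns it into a sum.
max_error = max(abs(c_speech[q] - (c_source[q] + c_filter[q])) for q in range(N))
print("largest deviation from additivity:", max_error)
print("source cepstrum at quefrency 8:", c_source[8])
```

In this toy case the repeat in the source shows up as a peak at quefrency 8, while the smooth filter contributes mostly at low quefrencies, which is what makes separating the two by windowing the cepstrum plausible.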
Taylor - Section 12.4 - Linear-Prediction Analysis
An overview of the background and maths behind linear-prediction methods for modelling the vocal tract as a filter.
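To make the filter idea concrete, here is a minimal sketch of the autocorrelation method solved with the Levinson-Durbin recursion, one common way of estimating linear-prediction coefficients. It is not the book's code: the second-order test filter, the signal length, and the function names are assumptions for illustration. The signal is the impulse response of a known all-pole filter, so the recovered coefficients can be compared with the true ones, and the prediction residual should be close to the original impulse.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation of a finite frame for lags 0..max_lag."""
    return [
        sum(x[t] * x[t - lag] for t in range(lag, len(x)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order],
    where the prediction is x_hat[n] = sum_k a[k] * x[n - k]."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:], error


# Assumed toy "vocal tract": one resonance, pole radius 0.9 at angle pi/4.
true_a1 = 2 * 0.9 * math.cos(math.pi / 4)
true_a2 = -(0.9 ** 2)

# Excite the filter with a single impulse and record 200 output samples.
x = []
for n in range(200):
    sample = 1.0 if n == 0 else 0.0
    if n >= 1:
        sample += true_a1 * x[n - 1]
    if n >= 2:
        sample += true_a2 * x[n - 2]
    x.append(sample)

coeffs, _ = levinson_durbin(autocorrelation(x, 2), 2)

# Residual: what is left after removing the predictable part of each sample.
residual = [
    x[n] - sum(coeffs[k] * x[n - 1 - k] for k in range(len(coeffs)) if n - 1 - k >= 0)
    for n in range(len(x))
]

print("estimated:", coeffs, "true:", [true_a1, true_a2])
```

Real speech frames are windowed and use a higher order than 2; the tiny order here just keeps the comparison with the known filter easy to read.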
Taylor - Section 12.7 - Pitch and epoch detection
Only an outline of the main approaches, with little technical detail. Useful as a summary of why these tasks are harder than you might think.
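For orientation only, the sketch below shows autocorrelation peak picking, one simple style of pitch detection, on a clean synthetic signal. The sample rate, frame length, search range, and harmonic amplitudes are all assumptions made for the demo. It works here because the input is perfectly periodic and noise-free; real speech brings octave errors, unvoiced regions and changing pitch, which is a large part of why these tasks are hard.

```python
import math

SAMPLE_RATE = 16000   # assumed sample rate in Hz
TRUE_F0 = 125.0       # assumed fundamental, giving a period of 128 samples
FRAME_LENGTH = 640    # 40 ms analysis frame

# Synthetic "voiced" frame: a fundamental plus two weaker harmonics.
frame = [
    math.sin(2 * math.pi * TRUE_F0 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2 * TRUE_F0 * t / SAMPLE_RATE)
    + 0.25 * math.sin(2 * math.pi * 3 * TRUE_F0 * t / SAMPLE_RATE)
    for t in range(FRAME_LENGTH)
]


def estimate_period(x, sample_rate, f0_min=60.0, f0_max=400.0):
    """Return the lag (in samples) with the largest autocorrelation
    inside the plausible pitch range."""
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(x[t] * x[t - lag] for t in range(lag, len(x)))
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


period = estimate_period(frame, SAMPLE_RATE)
estimated_f0 = SAMPLE_RATE / period
print("estimated period:", period, "samples; estimated F0:", estimated_f0, "Hz")
```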