speech processing module 08

Wayland (Phonetics) – Chapter 9 – Hearing

Introduces basic concepts in human hearing. The sections on decibels and loudness, and on the Mel and Bark scales, are the most useful for this module.

Taylor – Section 12.3 – The cepstrum

By using the logarithm to convert a multiplication into a sum, the cepstrum separates the source and filter components of speech.
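As a rough illustration of that statement (not taken from Taylor), the sketch below builds a toy "speech" signal by convolving an impulse-train source with a single decaying resonance, then checks that the cepstrum of the result equals the sum of the two separate cepstra. The signal lengths, the 80-sample pitch period, the resonance parameters and the helper name `real_cepstrum` are all assumptions chosen for the example.

```python
import numpy as np


def real_cepstrum(x, n_fft=1024):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    log_magnitude = np.log(np.abs(np.fft.fft(x, n_fft)))
    return np.fft.ifft(log_magnitude).real


# Source: impulse train with a pitch period of 80 samples (toy value).
period = 80
source = np.zeros(512)
source[::period] = 1.0

# Filter: one decaying resonance, standing in for a vocal tract response.
n = np.arange(64)
filt = 0.9 ** n * np.cos(0.3 * np.pi * n)

# Time domain: speech is the source convolved with the filter,
# which is a multiplication of their spectra in the frequency domain.
speech = np.convolve(source, filt)

c_speech = real_cepstrum(speech)
c_source = real_cepstrum(source)
c_filter = real_cepstrum(filt)

# The log turns the spectral product into a sum, so the cepstra add.
additive = np.allclose(c_speech, c_source + c_filter, atol=1e-8)

# The source shows up as a peak at the pitch period (high quefrency),
# away from the filter, which is concentrated at low quefrencies.
peak = 40 + int(np.argmax(c_speech[40:200]))

print("cepstra add:", additive)
print("peak quefrency (samples):", peak)
```

In this toy setup the filter contribution is concentrated in the first few cepstral coefficients while the source produces a peak at the pitch period, which is why keeping only the low-order coefficients gives an approximate description of the filter alone.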

Holmes & Holmes – Chapter 10 – Front-end analysis for ASR

Covers filterbank and MFCC features. The material on linear prediction is out of scope.

Jurafsky & Martin – Section 9.3 – Feature Extraction: MFCCs

Mel-frequency Cepstral Coefficients (MFCCs) are a widely used feature with HMM acoustic models. They are a classic example of feature engineering: manipulating the extracted features to suit the properties and limitations of the statistical model. Please note: the description of the MFCC extraction steps in this section differs somewhat from the standard definition of MFCCs and from what is actually implemented in HTK. For the assignment, follow the description of the MFCC extraction steps given in the videos here on speech zone and in the lectures.
