Jurafsky & Martin – Section 9.6 – Search and Decoding

Important material on efficiently searching for the best word sequence, where each hypothesis is scored by the acoustic model likelihood multiplied by the language model probability.
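
In practice the two scores are combined in the log domain during decoding, usually with a language model scaling factor and a word insertion penalty. A minimal sketch of that combination is below; the function name and the particular scale and penalty values are illustrative assumptions, not recommendations from the book.

```python
import math

def combined_log_score(acoustic_log_lik, lm_log_prob, num_words,
                       lm_scale=15.0, word_insertion_penalty=0.5):
    """Combine acoustic and language model scores in the log domain,
    as is done when scoring hypotheses during decoding.
    lm_scale and word_insertion_penalty are tuned empirically."""
    return (acoustic_log_lik
            + lm_scale * lm_log_prob
            + num_words * math.log(word_insertion_penalty))
```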

Jurafsky & Martin – Section 9.5 – The lexicon and language model

Simply mentions the lexicon and language model and refers the reader to other chapters.

Jurafsky & Martin – Section 9.4 – Acoustic Likelihood Computation

Performing speech recognition with HMMs involves calculating the likelihood that each model emitted the observed speech. You can skip 9.4.1 Vector Quantization.
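
For a single HMM, that likelihood can be computed with the forward algorithm. Here is a minimal log-domain sketch, assuming the model has already been expressed as numpy arrays (this is my own notation, not the book's):

```python
import numpy as np

def forward_log_likelihood(log_A, log_B, log_pi):
    """Total log-likelihood that one HMM generated the observation sequence.

    log_A  : (N, N) log transition probabilities, log_A[i, j] = log P(j | i)
    log_B  : (T, N) log emission likelihood of each observation under each state
    log_pi : (N,)   log initial state probabilities
    """
    T, N = log_B.shape
    alpha = log_pi + log_B[0]                 # initialise with the first observation
    for t in range(1, T):
        # sum (logaddexp) over previous states, for each current state j
        alpha = log_B[t] + np.array([
            np.logaddexp.reduce(alpha + log_A[:, j]) for j in range(N)
        ])
    return np.logaddexp.reduce(alpha)         # sum over final states
```

A recogniser would evaluate this for the HMM of each candidate word sequence and keep the highest-scoring one (after combining with the language model).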

Jurafsky & Martin – Section 9.3 – Feature Extraction: MFCCs

Mel-frequency cepstral coefficients (MFCCs) are widely used features for HMM acoustic models. They are a classic example of feature engineering: manipulating the extracted features to suit the properties and limitations of the statistical model.
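
If you want to look at MFCCs yourself, the sketch below uses the librosa library (my choice, not something the chapter prescribes); the filename is hypothetical. Thirteen coefficients per frame, plus deltas and delta-deltas, is a typical configuration for HMM systems.

```python
import librosa

# Load audio at 16 kHz and extract 13 MFCCs per frame
y, sr = librosa.load("utterance.wav", sr=16000)
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

# Dynamic features capture how the spectrum changes over time
deltas = librosa.feature.delta(mfccs)
delta_deltas = librosa.feature.delta(mfccs, order=2)

print(mfccs.shape)   # (13, number_of_frames)
```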

Jurafsky & Martin – Section 9.2 – The HMM Applied to Speech

Introduces some notation and the basic concepts of HMMs.

Jurafsky & Martin – Section 9.1 – Speech Recognition Architecture

Most modern methods of ASR can be described as a combination of two models: the acoustic model and the language model. They are combined simply by multiplying their probabilities.
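
Concretely, for an observed acoustic sequence O, the recogniser searches for the word sequence Ŵ that maximises the acoustic likelihood P(O|W) multiplied by the language model probability P(W):

```latex
\hat{W} = \operatorname*{argmax}_{W} P(W \mid O)
        = \operatorname*{argmax}_{W} P(O \mid W)\, P(W)
```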

Jurafsky & Martin – Section 8.5 – Unit Selection (Waveform) Synthesis

A brief explanation. Worth reading before tackling the more substantial chapter in Taylor (Speech Synthesis course only).

Jurafsky & Martin – Section 8.4 – Diphone Waveform Synthesis

A simple way to generate a waveform is by concatenating speech units from a pre-recorded database. The database contains one recording of each required speech unit.
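
A deliberately over-simplified sketch of the idea: look up one recorded waveform per diphone and join them end to end. Real diphone synthesis also modifies pitch and duration (e.g. with TD-PSOLA) and smooths the joins, none of which is shown here; the diphone names and database structure are hypothetical.

```python
import numpy as np

def naive_diphone_synthesis(diphone_sequence, diphone_db):
    """Concatenate one pre-recorded waveform per diphone.
    diphone_db maps diphone names to 1-D numpy arrays of samples."""
    return np.concatenate([diphone_db[d] for d in diphone_sequence])

# e.g. synthesising "hi" might use the diphones sil-h, h-ai, ai-sil:
# waveform = naive_diphone_synthesis(["sil-h", "h-ai", "ai-sil"], diphone_db)
```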

Jurafsky & Martin – Section 4.4 – Perplexity

It is possible to evaluate how good an N-gram model is without integrating it into an automatic speech recognition system: we simply measure how well it predicts some unseen test data.
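
That measure is perplexity: the inverse probability of the test set, normalised by the number of tokens. A minimal sketch for a bigram model, assuming a hypothetical function `bigram_log_prob(prev, word)` that returns log2 P(word | prev):

```python
import math

def perplexity(test_tokens, bigram_log_prob):
    """Perplexity of a bigram model on held-out data.
    The model must give non-zero probability to every test bigram,
    which is why smoothing is needed in practice."""
    log2_prob = 0.0
    for prev, word in zip(test_tokens, test_tokens[1:]):
        log2_prob += bigram_log_prob(prev, word)
    N = len(test_tokens) - 1                 # number of predicted tokens
    return 2 ** (-log2_prob / N)
```

Lower perplexity means the model predicts the test data better.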

Jurafsky & Martin – Section 4.3 – Training and Test Sets

As we should already know: in machine learning, it is essential to evaluate a model on data that it was not trained on.

Jurafsky & Martin – Section 4.2 – Simple (Unsmoothed) N-Grams

We can estimate N-gram probabilities directly from raw counts; this is the maximum likelihood estimate.
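
A minimal sketch for bigrams, using P(w_n | w_{n-1}) = C(w_{n-1} w_n) / C(w_{n-1}); the function name is mine, and sentence-boundary handling is ignored:

```python
from collections import Counter

def mle_bigram_probs(tokens):
    """Maximum likelihood bigram probabilities from raw counts.
    Unseen bigrams get probability zero, which is why smoothing
    is introduced later in the chapter."""
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    return {(prev, w): count / unigram_counts[prev]
            for (prev, w), count in bigram_counts.items()}
```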

Jurafsky & Martin – Section 4.1 – Word Counting in Corpora

The frequency of occurrence of each N-gram in a training corpus is used to estimate its probability.
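
Counting itself is straightforward; a toy example (my own, not from the book):

```python
from collections import Counter

tokens = "the cat sat on the mat".split()    # a toy training corpus
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))

print(unigrams["the"])           # 2
print(bigrams[("the", "cat")])   # 1
```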