Jurafsky & Martin – Section 4.3 – Training and Test Sets

As we should already know, in machine learning it is essential to evaluate a model on data that it was not trained on.

Jurafsky & Martin – Section 4.2 – Simple (Unsmoothed) N-Grams

We can just use raw counts to estimate probabilities directly.
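As a rough illustration of estimating probabilities from raw counts, here is a minimal sketch for the bigram case: the probability of a word given the previous word is the count of the pair divided by the count of the previous word. The function name `bigram_mle` and the toy corpus are invented for this example and do not come from the reading.

```python
from collections import Counter


def bigram_mle(tokens):
    """Estimate P(word | prev) as count(prev, word) / count(prev), with no smoothing."""
    # Count every token that has a successor (all but the last one).
    history_counts = Counter(tokens[:-1])
    # Count every adjacent pair of tokens.
    bigram_counts = Counter(zip(tokens[:-1], tokens[1:]))
    return {
        (prev, word): count / history_counts[prev]
        for (prev, word), count in bigram_counts.items()
    }


# Toy corpus (an assumption for illustration), with sentence boundary markers.
# Pairs that span a sentence boundary, such as ("</s>", "<s>"), are counted too.
corpus = "<s> i am sam </s> <s> sam i am </s>".split()
probs = bigram_mle(corpus)

print(probs[("i", "am")])    # "i" occurs twice, followed by "am" both times
print(probs[("<s>", "i")])   # "<s>" occurs twice, followed by "i" once
```

Any bigram that never occurs in the training corpus is simply absent from the dictionary, which is the zero-probability problem that smoothing addresses.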

Jurafsky & Martin – Section 4.1 – Word Counting in Corpora

The frequency of occurrence of each N-gram in a training corpus is used to estimate its probability.

Jurafsky & Martin – Chapter 4 – N-Grams

A simple and effective way to model language is as a sequence of words. We assume that the probability of each word depends only on the identity of the preceding N-1 words.
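Written out, the assumption looks something like the following. The notation is an assumption here: $w_i^j$ stands for the word sequence $w_i \dots w_j$. The first step is the exact chain rule and the second is the N-gram approximation.

```latex
P(w_1^n) \;=\; \prod_{k=1}^{n} P(w_k \mid w_1^{k-1})
         \;\approx\; \prod_{k=1}^{n} P(w_k \mid w_{k-N+1}^{k-1})
```

For a bigram model ($N = 2$), each factor reduces to $P(w_k \mid w_{k-1})$.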

Jurafsky & Martin – Section 9.6 – Search and Decoding

Important material on efficiently computing the combined score used in recognition: the likelihood from the acoustic model multiplied by the probability from the language model.
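In symbols, the quantity being searched over can be sketched as below. The symbols are assumptions for illustration: $O$ is the observation sequence, $W$ a candidate word sequence, $P(O \mid W)$ the acoustic model likelihood and $P(W)$ the language model probability. In practice the product is usually computed as a sum of log probabilities, which avoids numerical underflow.

```latex
\hat{W} \;=\; \arg\max_{W} \; P(O \mid W)\, P(W)
        \;=\; \arg\max_{W} \; \bigl[ \log P(O \mid W) + \log P(W) \bigr]
```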

Jurafsky & Martin – Section 9.8 – Evaluation

When recognising connected speech, three types of error are possible: substitutions, insertions, and deletions of words. It is usual to combine them into a single measure: Word Error Rate.
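A minimal sketch of how Word Error Rate might be computed, assuming the usual definition: the minimum number of substitutions, deletions and insertions needed to turn the reference into the recogniser output, divided by the number of words in the reference. The function name and the example sentences are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Minimum word-level edit distance, divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # dist[i][j] = fewest edits that turn ref[:i] into hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert all j hypothesis words

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution_cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + substitution_cost,  # match or substitution
                dist[i - 1][j] + 1,                      # deletion
                dist[i][j - 1] + 1,                      # insertion
            )
    return dist[-1][-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) against 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{wer:.3f}")
```

Because insertions count as errors, Word Error Rate can exceed 100% when the output is much longer than the reference.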

Jurafsky & Martin – Section 9.7 – Embedded training

Embedded training means that the data are transcribed, but that we don’t know the time alignment at the model or state levels.

Young et al: Token Passing

My favourite way of understanding how the Viterbi algorithm is applied to HMMs. Can also be helpful in understanding search for unit selection speech synthesis.
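To make the idea concrete, here is a small sketch in the spirit of token passing, not the paper's own formulation: every state holds one token carrying a log probability and a path, tokens are copied along transitions at each time step, and each state keeps only the best arrival. The toy HMM (states `A` and `B`, observations `x`, `y`, `z`) and all of its probabilities are invented for illustration, and the code assumes no probability is zero.

```python
import math


def token_passing(observations, states, start, trans, emit):
    """Return (log probability, state path) of the best token after the last frame."""
    # Each state holds one token: (log probability so far, state path so far).
    tokens = {
        s: (math.log(start[s]) + math.log(emit[s][observations[0]]), [s])
        for s in states
    }
    for obs in observations[1:]:
        new_tokens = {}
        for dest in states:
            # Send a copy of every token along its transition into dest,
            # then discard all but the most probable arrival.
            score, path = max(
                (tokens[src][0] + math.log(trans[src][dest]), tokens[src][1])
                for src in states
            )
            new_tokens[dest] = (score + math.log(emit[dest][obs]), path + [dest])
        tokens = new_tokens
    # The best surviving token carries the most likely state sequence.
    return max(tokens.values())


# Toy discrete HMM; every number here is an assumption for illustration.
states = ["A", "B"]
start = {"A": 0.6, "B": 0.4}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit = {"A": {"x": 0.5, "y": 0.4, "z": 0.1}, "B": {"x": 0.1, "y": 0.3, "z": 0.6}}

log_prob, best_path = token_passing(["x", "y", "z"], states, start, trans, emit)
print(best_path, math.exp(log_prob))
```

Keeping the path inside the token is what makes this convenient: no separate backtrace is needed once the last observation has been processed.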

Jurafsky & Martin – Section 9.5 – The lexicon and language model

Simply mentions the lexicon and language model and refers the reader to other chapters.

Taylor – Section 12.3 – The cepstrum

By using the logarithm to convert a multiplication into a sum, the cepstrum separates the source and filter components of speech.
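The reasoning can be sketched in a few lines. The symbols are assumptions for illustration: $s[n]$ is the speech signal, $e[n]$ the source excitation, $h[n]$ the vocal tract filter's impulse response, and capital letters are their Fourier transforms. Convolution in time becomes multiplication in frequency, and the logarithm turns that product into a sum.

```latex
\begin{align*}
s[n] &= e[n] * h[n] \\
|S(\omega)| &= |E(\omega)|\,|H(\omega)| \\
\log |S(\omega)| &= \log |E(\omega)| + \log |H(\omega)| \\
c[n] &= \mathcal{F}^{-1}\bigl\{ \log |S(\omega)| \bigr\} = c_e[n] + c_h[n]
\end{align*}
```

Because the inverse transform is linear, the cepstrum $c[n]$ is the sum of a source part and a filter part. The slowly varying filter envelope ends up in the low-order coefficients, while the periodic source appears further along the quefrency axis.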

Holmes & Holmes – Chapter 10 – Front-end analysis for ASR

Covers filterbank and MFCC features. The material on linear prediction is out of scope.

Sharon Goldwater: Basic probability theory

A primer on this topic. You should consider this reading ESSENTIAL if you haven’t studied probability before, or if it’s been a while. We’re adding this to the readings in Module 7 to give you some time to look at it before we really need it in Module 9. Mostly we need the concepts of conditional probability and conditional independence.