Jurafsky, Daniel, and James H. Martin. Speech and Language Processing. Pearson Education UK, 2013.
If you want to buy a copy, your best option is a secondhand copy of the International edition.
Jurafsky & Martin - Chapter 2 - Regular Expressions and Automata
An important technique used widely in NLP. In TTS, it can be applied to tasks such as detecting and expanding non-standard words.
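As a minimal sketch of that idea (the patterns and category names here are invented for illustration, not taken from the book), a few regular expressions can flag tokens that need expansion before synthesis:

import re

# Classify non-standard word tokens before expansion.
# Patterns are checked in order; the first match wins.
NSW_PATTERNS = [
    (re.compile(r"^\d{4}$"), "year"),          # e.g. "1984"
    (re.compile(r"^\d+(\.\d+)?$"), "number"),  # e.g. "3.14"
    (re.compile(r"^[A-Z]{2,}$"), "letters"),   # e.g. "BBC" -> letter sequence
]

def classify_token(token):
    for pattern, category in NSW_PATTERNS:
        if pattern.match(token):
            return category
    return "ordinary"

print(classify_token("1984"))  # -> "year"
print(classify_token("BBC"))   # -> "letters"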
Jurafsky & Martin (3rd Ed) - Hidden Markov models
An overview of Hidden Markov Models, the Viterbi algorithm, and the Baum-Welch algorithm.
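A minimal sketch of the Viterbi algorithm for a discrete-output HMM, assuming the model is given as numpy arrays of initial, transition, and emission probabilities (the variable names and toy numbers are my own, not from the book):

import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely state sequence for a discrete-output HMM.
    obs: observation indices; pi: initial probs (N,);
    A: transition probs (N, N); B: emission probs (N, M)."""
    N, T = len(pi), len(obs)
    delta = np.zeros((T, N))           # best path probability so far
    psi = np.zeros((T, N), dtype=int)  # backpointers
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A  # (N, N): from state i to state j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    # Backtrace from the best final state
    states = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        states.append(int(psi[t][states[-1]]))
    return states[::-1]

# Toy 2-state example (all numbers invented):
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.5], [0.1, 0.9]])
print(viterbi([0, 1, 1], pi, A, B))  # -> [0, 1, 1]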
Jurafsky & Martin - Section 3.1 - English Morphology
In speech technology for English, little or no use is made of morphology. But for other languages, it is essential.
Jurafsky & Martin - Section 3.2 - Finite-State Morphological Parsing
Automatic morphological decomposition of written words is possible. However, this section does not consider the added complication of deriving a pronunciation.
Jurafsky & Martin - Section 3.3 - Construction of a Finite-State Lexicon
A lexicon can be represented using different data structures (finite-state network, tree, lookup table, ...), depending on the application.
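For instance, the tree representation can be a trie, which makes prefix sharing between words explicit. A minimal sketch with an invented toy lexicon:

# Build a trie from a word list; "$" marks end-of-word.
def build_trie(words):
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def lookup(trie, word):
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return "$" in node

lexicon = build_trie(["cat", "cats", "car"])
print(lookup(lexicon, "cats"))  # True
print(lookup(lexicon, "ca"))    # False (a prefix, not a word)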
Jurafsky & Martin - Section 3.4 - Finite-State Transducers
FSTs are a powerful and general-purpose mechanism for mapping ("transducing") an input string to an output string.
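A minimal sketch of transduction, with a hand-built toy machine stored as a transition table (the states, the "+PL" morpheme symbol, and the plural rule are invented for illustration):

# A toy FST: (state, input_symbol) -> (next_state, output_symbol).
# This machine copies symbols and rewrites a final "+PL" morpheme to "s".
TRANSITIONS = {
    ("copy", "+PL"): ("done", "s"),
}

def transduce(symbols):
    state, output = "copy", []
    for sym in symbols:
        # Default: stay in the current state and copy the symbol through.
        state, out = TRANSITIONS.get((state, sym), (state, sym))
        output.append(out)
    return "".join(output)

print(transduce(list("cat") + ["+PL"]))  # -> "cats"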
Jurafsky & Martin - Chapter 4 - N-Grams
A simple and effective way to model language is as a sequence of words. We assume that the probability of each word depends only on the identity of the preceding N-1 words.
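As an equation, this is the standard N-gram approximation of the chain rule:

P(w_1, \ldots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1}) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-N+1}, \ldots, w_{i-1})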
Jurafsky & Martin - Section 4.1 - Word Counting in Corpora
The frequency of occurrence of each N-gram in a training corpus is used to estimate its probability.
Jurafsky & Martin - Section 4.2 - Simple (Unsmoothed) N-Grams
We can use raw counts to estimate probabilities directly; this is the maximum likelihood estimate.
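A minimal sketch of this maximum likelihood estimate for bigrams, on an invented two-sentence corpus:

from collections import Counter

# Toy corpus; <s> and </s> mark sentence boundaries.
corpus = [["<s>", "i", "like", "tea", "</s>"],
          ["<s>", "i", "like", "coffee", "</s>"]]

bigrams = Counter((s[i], s[i + 1]) for s in corpus for i in range(len(s) - 1))
unigrams = Counter(w for s in corpus for w in s)

def p(w2, w1):
    """P(w2 | w1) = count(w1, w2) / count(w1) -- raw counts, no smoothing."""
    return bigrams[(w1, w2)] / unigrams[w1]

print(p("like", "i"))    # 1.0
print(p("tea", "like"))  # 0.5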
Jurafsky & Martin - Section 4.3 - Training and Test Sets
As we should already know: in machine learning it is essential to evaluate a model on data that it was not trained on.
Jurafsky & Martin - Section 4.4 - Perplexity
It is possible to evaluate how good an N-gram model is without integrating it into an automatic speech recogniser. We can instead measure its perplexity on a held-out test set.
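For a test set W = w_1, ..., w_N, perplexity is the inverse probability of the test set, normalised by the number of words:

PP(W) = P(w_1, \ldots, w_N)^{-\frac{1}{N}}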
Jurafsky & Martin - Chapter 5 - Part-of-Speech Tagging
For our purposes, only sections 5.1 to 5.5 are needed.
Jurafsky & Martin - Chapter 8 - Speech Synthesis
A good place to start, before moving on to Taylor's book for a more in-depth treatment of this topic.
Jurafsky & Martin (2nd ed) - Section 8.1 - Text Normalisation
We need to normalise the input text so that it contains a sequence of pronounceable words.
Jurafsky & Martin (2nd ed) - Section 8.2 - Phonetic Analysis
Each word in the normalised text needs a pronunciation. Most words will be found in the dictionary, but for the remainder we must predict a pronunciation, for example with letter-to-sound (grapheme-to-phoneme) rules.
Jurafsky & Martin (2nd ed) - Section 8.3 - Prosodic Analysis
Beyond getting the phones right, we also need to consider other aspects of speech such as intonation and pausing.
Jurafsky & Martin - Section 8.4 - Diphone Waveform Synthesis
A simple way to generate a waveform is by concatenating speech units from a pre-recorded database. The database contains one recorded example of each diphone.
Jurafsky & Martin - Section 8.5 - Unit Selection (Waveform) Synthesis
A brief explanation. Worth reading before tackling the more substantial chapter in Taylor (Speech Synthesis course only).
Jurafsky & Martin - Chapter 9 - Speech Recognition
Basic material on automatic speech recognition. A good starting point, but not enough on its own.
Jurafsky & Martin - Chapter 9 introduction
The difficulty of ASR depends on factors including vocabulary size, within- and across-speaker variability (including speaking style), and channel and noise conditions.
Jurafsky & Martin - Section 9.1 - Speech Recognition Architecture
Most modern methods of ASR can be described as a combination of two models: the acoustic model and the language model.
Jurafsky & Martin - Section 9.2 - The HMM Applied to Speech
Introduces some notation and the basic concepts of HMMs.
Jurafsky & Martin - Section 9.3 - Feature Extraction: MFCCs
Mel-Frequency Cepstral Coefficients are widely used features for HMM acoustic models. They are a classic example of feature engineering: manipulating the raw signal into a representation better suited to the model.
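A minimal sketch of extracting MFCCs, assuming the librosa library is installed and a file speech.wav is available (both assumptions of mine, not part of the reading):

import librosa

# Load a waveform; sr=None keeps the file's native sample rate.
y, sr = librosa.load("speech.wav", sr=None)

# 13 MFCCs per frame is the classic choice for HMM acoustic models;
# delta (and delta-delta) features are usually appended to capture dynamics.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
delta = librosa.feature.delta(mfcc)

print(mfcc.shape)  # (13, number_of_frames)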
Jurafsky & Martin - Section 9.4 - Acoustic Likelihood Computation
Performing speech recognition with HMMs involves calculating the likelihood that each model emitted the observed speech. You can skip…
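A minimal sketch of this likelihood computation using the forward algorithm, reusing the discrete-output HMM conventions from the Viterbi sketch above (pi, A, B are my own names):

import numpy as np

def forward_likelihood(obs, pi, A, B):
    """P(obs | model), summing over all possible state sequences.
    Same conventions as the Viterbi sketch: pi (N,), A (N, N), B (N, M)."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # sum over predecessors, then emit
    return alpha.sum()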
Jurafsky & Martin - Section 9.5 - The lexicon and language model
Simply mentions the lexicon and language model and refers the reader to other chapters.
Jurafsky & Martin - Section 9.6 - Search and Decoding
Important material on efficiently finding the best word sequence by combining the acoustic model likelihood with the language model probability.
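In symbols, with O the acoustic observations and W a candidate word sequence, the decoder searches for

\hat{W} = \arg\max_{W} \; P(O \mid W) \, P(W)

which in practice is computed as a sum of log probabilities, usually with a language model scale factor.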
Jurafsky & Martin - Section 9.7 - Embedded training
Embedded training means that the data are transcribed, but that we don't know the time alignment at the model or state level.
Jurafsky & Martin - Section 9.8 - Evaluation
In connected speech, three types of error are possible: substitutions, insertions, and deletions of words. It is usual to combine these into a single measure, the Word Error Rate (WER).
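A minimal sketch of WER via edit distance between reference and hypothesis word sequences (a standard dynamic programme; the function and variable names are my own):

def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # match or substitution
    return d[-1][-1] / len(ref)

print(word_error_rate("the cat sat", "the cat sat down"))  # 1 insertion / 3 words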