Ladefoged & Johnson – A Course in Phonetics – Chapter 8 – Acoustic Phonetics

Links the source-filter model to spectrograms and acoustic analysis of speech.

Introduction to the IPA from the Handbook of the International Phonetic Association

Describes the aims of the International Phonetic Alphabet and its various uses.

Practical Phonetics

Videos for the course Practical Phonetics

Normal Speech Articulation

X-ray movies of speech

Seeing Speech

Interactive IPA chart

Taylor – Section 12.7 – Pitch and epoch detection

Only an outline of the main approaches, with little technical detail. Useful as a summary of why these tasks are harder than you might think.

Jurafsky & Martin – Section 8.5 – Unit Selection (Waveform) Synthesis

A brief explanation. Worth reading before tackling the more substantial chapter in Taylor (Speech Synthesis course only).

Furui et al. – Fundamental Technologies in Modern Speech Recognition

A complete issue of IEEE Signal Processing Magazine. Although a few years old, this is still a very useful survey of current techniques.

Holmes & Holmes – Chapter 9 – Stochastic Modelling

May be helpful as a complement to the essential readings.

Holmes & Holmes – Chapter 11 – Improving Speech Recognition Performance

We mitigate the over-simplifications of the model using ever-more-complex algorithms.

Jurafsky & Martin – Section 4.4 – Perplexity

It is possible to evaluate how good an N-gram model is without integrating it into an automatic speech recognition system. We simply measure how well it predicts some unseen test data.
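
As a rough illustration of this kind of evaluation, the sketch below computes the perplexity of a bigram model on a held-out token sequence. It is not taken from Jurafsky & Martin: the function names, the add-one smoothing, and the closed, known vocabulary of size vocab_size are all assumptions made to keep the example short.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Count bigrams and their histories in a list of training tokens."""
    histories = Counter(tokens[:-1])            # every token that has a successor
    bigrams = Counter(zip(tokens, tokens[1:]))  # adjacent token pairs
    return histories, bigrams


def bigram_prob(prev, word, histories, bigrams, vocab_size):
    """Add-one smoothed estimate of P(word | prev); smoothing is an assumption."""
    return (bigrams[(prev, word)] + 1) / (histories[prev] + vocab_size)


def perplexity(test_tokens, histories, bigrams, vocab_size):
    """Perplexity of the bigram model on unseen test tokens (lower is better)."""
    pairs = list(zip(test_tokens, test_tokens[1:]))
    if not pairs:
        raise ValueError("need at least two test tokens")
    # Sum log probabilities rather than multiplying, to avoid underflow.
    log_prob = sum(
        math.log2(bigram_prob(p, w, histories, bigrams, vocab_size))
        for p, w in pairs
    )
    # Perplexity is 2 to the power of the average negative log2 probability.
    return 2 ** (-log_prob / len(pairs))


# Toy data: a closed vocabulary of two words, assumed known in advance.
histories, bigrams = train_bigram("a b a b a b".split())
similar = perplexity("a b a b".split(), histories, bigrams, vocab_size=2)
different = perplexity("a a b b".split(), histories, bigrams, vocab_size=2)
```

Here the test sequence that resembles the training data receives the lower perplexity, which is the sense in which the model "predicts" it better. A model with no training counts falls back to a uniform distribution, and its perplexity then equals the vocabulary size.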

Jurafsky & Martin – Section 4.3 – Training and Test Sets

As we should already know: in machine learning it is essential to evaluate a model on data that it was not trained on.