Taylor – Section 12.4 – Linear-Prediction Analysis

An overview of the background and maths behind linear-prediction methods for modelling the vocal tract as a filter.
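
As a rough illustration of the idea (a sketch, not taken from Taylor): linear prediction approximates each sample as a weighted sum of the previous few samples, and those weights define an all-pole filter. The code below uses the autocorrelation method with the Levinson-Durbin recursion, which is one common way to estimate the weights. The function names and the toy signal are assumptions made for this example only.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal (zero outside its range)."""
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for predictor weights a[1..order].

    Convention assumed here: s[n] is predicted as sum_k a[k] * s[n - k].
    Returns the weights and the remaining prediction error energy.
    """
    a = [0.0] * (order + 1)  # a[0] is unused, kept so indices match the maths
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:], error


# Toy signal (hypothetical): a decaying exponential, where each sample is
# exactly 0.9 times the previous one, so one weight should suffice.
signal = [0.9 ** n for n in range(200)]
weights, residual_energy = levinson_durbin(autocorrelation(signal, 2), order=2)
```

For this toy signal the first weight comes out very close to 0.9 and the second very close to zero. Real speech needs a higher order and is analysed in short windowed frames.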

Taylor – Section 12.7 – Pitch and epoch detection

Only an outline of the main approaches, with little technical detail. Useful as a summary of why these tasks are harder than you might think.

Jurafsky & Martin – Section 8.5 – Unit Selection (Waveform) Synthesis

A brief explanation. Worth reading before tackling the more substantial chapter in Taylor (Speech Synthesis course only).

Jurafsky & Martin – Section 8.4 – Diphone Waveform Synthesis

A simple way to generate a waveform is by concatenating speech units from a pre-recorded database. The database contains one recording of each required speech unit.
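
A minimal sketch of that idea, assuming a toy database held in a dictionary: the diphone names, the sample values and the function names are all made up for illustration, and the units are simply joined end to end. A real system would also smooth the joins and modify the prosody.

```python
def phones_to_diphones(phones):
    """Name the diphone spanning each adjacent pair of phones."""
    return [f"{a}-{b}" for a, b in zip(phones, phones[1:])]


def synthesise(phones, database):
    """Concatenate the single stored recording of each required diphone."""
    waveform = []
    for name in phones_to_diphones(phones):
        # Raises KeyError if the diphone was never recorded.
        waveform.extend(database[name])
    return waveform


# Hypothetical database: one short list of samples per diphone.
toy_database = {
    "sil-k": [0.0, 0.1],
    "k-ae": [0.3, 0.5, 0.2],
    "ae-t": [0.4, -0.1],
    "t-sil": [-0.2, 0.0],
}

waveform = synthesise(["sil", "k", "ae", "t", "sil"], toy_database)
```

Because there is exactly one recording per unit, synthesis here is only a lookup followed by concatenation. There is no choice to make between candidate units.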

Taylor – Section 10.2 – Digital signals

Going digital involves approximations in the way an original analogue signal is represented.
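
The two main approximations are sampling (keeping values only at discrete times) and quantisation (rounding each value to one of a fixed set of levels). Here is a small sketch of both, using a sine function as a stand-in for the analogue signal. The function names, the 8 kHz sampling rate and the 8-bit depth are assumptions chosen for the example.

```python
import math


def sample_and_quantise(signal, duration, sample_rate, bits):
    """Sample signal(t) at regular intervals, then round each value to the
    nearest of 2**bits evenly spaced levels spanning -1 to 1."""
    step = 2.0 / (2 ** bits - 1)  # spacing between adjacent levels
    samples = []
    for n in range(round(duration * sample_rate)):
        x = signal(n / sample_rate)
        x = max(-1.0, min(1.0, x))  # clip anything outside the range
        samples.append(round((x + 1.0) / step) * step - 1.0)
    return samples


def analogue(t):
    """Stand-in for a continuous signal: a 100 Hz sine wave."""
    return math.sin(2 * math.pi * 100 * t)


digital = sample_and_quantise(analogue, duration=0.01, sample_rate=8000, bits=8)

# Quantisation error at each sampling instant is at most half a step.
max_error = max(abs(analogue(n / 8000) - s) for n, s in enumerate(digital))
```

Using fewer bits makes the steps coarser and the error larger. A lower sampling rate discards more of what happens between samples.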

Taylor – Section 10.1 – Analogue signals

It’s easier to start by understanding physical signals – which are analogue – before we then approximate them digitally.

Holmes & Holmes – Chapter 6 – Phonetic Synthesis by Rule

Mainly of historical interest.

Holmes & Holmes – Chapter 5 – Message synthesis from stored human speech components

Pitch-synchronous overlap-and-add (PSOLA) remains a key technique in speech signal processing.

Ladefoged (Elements) – Chapter 11 – Digital filters and LPC analysis

A brave attempt to spell out in ‘longhand’ how LPC analysis works, but not recommended reading.

Ladefoged (Elements) – Chapter 10 – Fourier analysis

An attempt to explain Fourier analysis. Although Chapters 1–9 are great, I do not recommend Chapter 10.