These readings may be helpful as a complement to the essential readings.
Holmes & Holmes – Chapter 8 – Template matching and dynamic time warping
Read up to the end of 8.5 carefully. Try to read 8.6 as part of Module 7, but rest assured we will go over the concept of dynamic programming again in Module 9. We recommend that you skim 8.7 and 8.8 because the same general concepts carry forward into Hidden Markov Models (again, we’ll come back to this in Module 9). You don’t need to read 8.9 onwards. Methods like DTW are rarely used now in state-of-the-art systems, but are a good way to start understanding some core ideas.
Holmes & Holmes – Chapter 6 – Phonetic Synthesis by Rule
Mainly of historical interest.
Holmes & Holmes – Chapter 5 – Message synthesis from stored human speech components
Pitch-synchronous overlap-and-add (PSOLA) remains a key technique in speech signal processing.
Holmes & Holmes – Chapter 11 – Improving Speech Recognition Performance
We mitigate the over-simplifications of the model using ever-more-complex algorithms.
Holmes & Holmes – Chapter 10 – Front-end analysis for ASR
Covers filterbank and MFCC features. The material on linear prediction is out of scope.
Handbook of phonetic sciences – Ch 20 – Intro to Signal Processing for Speech (Sections 6-7)
Written for a non-technical audience, this gently introduces some key concepts in speech signal processing. Read sections 6-7.
Handbook of phonetic sciences – Ch 20 – Intro to Signal Processing for Speech (Sections 1-5)
Written for a non-technical audience, this gently introduces some key concepts in speech signal processing. Read sections 1-5 (up to and including ‘Fourier Analysis’).
Handbook of phonetic sciences – Ch 20 – Intro to Signal Processing for Speech
Written for a non-technical audience, this gently introduces some key concepts in speech signal processing.
Furui et al: Fundamental Technologies in Modern Speech Recognition
A complete issue of IEEE Signal Processing Magazine. Although a few years old, this is still a very useful survey of current techniques.
Fitt & Isard – Synthesis of regional English using a keyword lexicon
The source of the Scottish English pronunciations you’ll see in Unilex (and so in the Speech Processing assignment).
Carr – English Phonetics and Phonology: An Introduction – Ch 5 – The Phonemic Principle
Takes you from phonetics (which is about sound) to phonology (which is about mental representation and organisation into categories).