Speech Processing

From the foundations of speech signals and phonetics, to text-to-speech synthesis and automatic speech recognition.

This course is taught at the University of Edinburgh at advanced undergraduate and Masters levels.

This course has a policy of continual improvement. The modules are in the process of being updated for 2024/25.

Module 0 - getting started
Start here! Gives an introduction to the course, explains how the course is delivered, and describes the computing environment you will need.
Weekly schedule
The weekly schedule shows which module we are covering each week.
Module 1 - Phonetics and Representations of Speech
An introduction to phonetics and how we can visualise speech
Module 2 - Acoustic Phonetics
We can analyse differences in the articulation of vowels and consonants in terms of acoustic phonetic features.
Module 3 - Digital Speech Signals
What are spectrograms really? An introduction to Digital Signal Processing and the Discrete Fourier Transform
Module 4 - the Source-Filter Model
Building on our understanding of digital signal processing, we look at the source-filter model from more of an engineering perspective.
Module 5 - speech synthesis - phonemes and the front end
Pronunciation, including letter-to-sound models, and predicting prosody. All these tasks can be done with Classification And Regression Trees (CARTs).
Module 6 - Speech Synthesis - waveform generation and connected speech
Manipulating recorded speech signals to create new utterances.
Intermission
Some notes about the course structure, a look back to what you have learned so far, and what is coming up.
Module 7 - Speech Recognition - Pattern matching
The most basic way to recognise speech is by comparing the speech to be recognised with stored reference examples.
Module 8 - Speech Recognition - Feature engineering
To get the best out of machine learning, we can prepare features that reflect our knowledge of the problem, and suit our chosen model.
Module 9 - Speech Recognition - the Hidden Markov Model
We now replace pattern matching with a generative model that is learned from data.
Module 10 - Speech Recognition - Connected speech & HMM training
HMMs extend easily to connected speech, so finally we put everything together to make a complete speech recognition system. We'll also learn how to train an HMM from data.
Milestones
To keep on track, check your progress against these milestones. Try to stay ahead of them if you can.
Marking policy
The policy is positive: it encourages you to attempt all parts of the coursework and exam, and rewards both partially-correct and fully-correct work.
