Parameterisation
A usual first step in machine learning is to parameterise the signal (also called "feature extraction"), and here we'll make a first attempt at that.
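A very first attempt at parameterisation might be as simple as slicing the waveform into fixed-length frames and summarising each frame with a single number. This sketch computes log energy per frame; the frame length and the flooring constant are illustrative choices, not values from the course.

```python
import math

def log_energy_features(samples, frame_len=160):
    """Slice a waveform into fixed-length frames and return the
    log energy of each frame (frame_len is an assumed value)."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        feats.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return feats
```

One number per frame is far too crude for recognition, but it already turns a variable-length waveform into a sequence of feature values, which is the shape of data everything later in the course operates on.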
Dynamic Time Warping
This rather old-fashioned method is a great way to understand Dynamic Programming, a very widely applicable technique.
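The dynamic-programming idea can be sketched in a few lines. This is a minimal DTW between two 1-D sequences, assuming absolute difference as the local distance; real systems compare frames of feature vectors instead.

```python
def dtw(a, b):
    """Dynamic Time Warping cost between sequences a and b,
    using |a[i] - b[j]| as the local distance (an assumption)."""
    INF = float("inf")
    n, m = len(a), len(b)
    # D[i][j] = cost of the best alignment of a[:i] with b[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # dynamic programming: extend the cheapest of three predecessors
            D[i][j] = cost + min(D[i - 1][j],      # stay in b, advance a
                                 D[i][j - 1],      # stay in a, advance b
                                 D[i - 1][j - 1])  # advance both
    return D[n][m]
```

The key property to notice: each cell is filled once from its already-computed neighbours, so the cost is O(nm) rather than exponential in the number of possible alignments.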
Probability density functions
Probability density functions can be initially thought of as a kind of distance measure that we learn from the data.
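To make the "learned distance" idea concrete, here is a univariate Gaussian fitted to some training values, with its negative log density used as a score: the further an observation is from what the model expects, the larger the value. A one-dimensional Gaussian is an illustrative choice; the course's models generalise this to feature vectors.

```python
import math

def fit_gaussian(data):
    """Estimate mean and variance of a univariate Gaussian from data."""
    mean = sum(data) / len(data)
    var = sum((x - mean) ** 2 for x in data) / len(data)
    return mean, var

def neg_log_density(x, mean, var):
    """Negative log of the Gaussian density: behaves like a distance
    from x to the training data that the Gaussian was fitted to."""
    return 0.5 * math.log(2 * math.pi * var) + (x - mean) ** 2 / (2 * var)
```

Unlike a fixed distance measure, the "distance" here adapts to the data: a Gaussian fitted to widely-spread training values penalises a given deviation less than one fitted to tightly-clustered values.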
Mel frequency cepstral coefficients
Another common step in machine learning is to use our knowledge to engineer a better parameterisation of the signal.
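One piece of knowledge engineered into MFCCs is the mel scale, a frequency warping that reflects human pitch perception. The standard conversion formula is short enough to sketch here (this is just the frequency warping, not the full MFCC pipeline of filterbank, log, and cosine transform):

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Invert the mel-scale warping back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
```

The warping is roughly linear below 1 kHz and logarithmic above, so mel-spaced filters devote more resolution to the low frequencies where human hearing discriminates best.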
Hidden Markov Models
Now we can develop a powerful generative model and see it as a generalisation of DTW.
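The connection to DTW can be seen in the Viterbi algorithm: the same fill-a-table dynamic programming, but over HMM states and log probabilities rather than template frames and distances. This is a bare sketch assuming the model is given as log transition and log emission tables, with the path forced to start in state 0.

```python
def viterbi(log_trans, log_emit, n_states):
    """Best log probability of any state path through an HMM.
    log_trans[p][s]: log p(state s | state p)
    log_emit[t][s]:  log p(observation t | state s)
    (table-based input is an assumption for this sketch)"""
    T = len(log_emit)
    NEG_INF = float("-inf")
    best = [NEG_INF] * n_states
    best[0] = log_emit[0][0]  # assume the path starts in state 0
    for t in range(1, T):
        new = [NEG_INF] * n_states
        for s in range(n_states):
            # dynamic programming: best predecessor, as in DTW
            for prev in range(n_states):
                score = best[prev] + log_trans[prev][s] + log_emit[t][s]
                if score > new[s]:
                    new[s] = score
        best = new
    return max(best)
```

Where DTW sums hand-picked distances along an alignment path, Viterbi sums log probabilities that are learned from data, which is the sense in which the HMM generalises DTW.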
Evaluation
How can we measure the performance of an automatic speech recognition system? How many words did it get right or wrong?
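The standard answer is word error rate: align the hypothesis against the reference with minimum edit distance, then divide the number of substitutions, insertions, and deletions by the reference length. A minimal sketch, treating whitespace-separated tokens as words:

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: Levenshtein distance between word sequences,
    divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # D[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    D = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        D[i][0] = i  # delete all reference words
    for j in range(len(hyp) + 1):
        D[0][j] = j  # insert all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            D[i][j] = min(D[i - 1][j] + 1,        # deletion
                          D[i][j - 1] + 1,        # insertion
                          D[i - 1][j - 1] + sub)  # substitution or match
    return D[len(ref)][len(hyp)] / len(ref)
```

Note that the table-filling recursion is the same dynamic programming pattern as DTW, only now over words and edit costs. Because insertions are counted, the word error rate can exceed 100%.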
Training HMMs
We need to know how to estimate the parameters of our models. Because training is harder to understand than recognition, we only tackle it after we understand how recognition works.
Continuous speech
Another great thing about token passing is that it makes the extension to connected speech almost trivial.
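A toy sketch of why the extension is almost trivial: if every word is collapsed to a single-state model with a hypothetical per-frame log score, then connecting words only requires one extra rule at word boundaries, where the best word-final token is copied into every word start with its history extended. All names and scores below are illustrative, not from the course.

```python
from dataclasses import dataclass

@dataclass
class Token:
    log_prob: float
    history: tuple  # words recognised so far

def step(tokens, frame_scores):
    """One frame of toy token passing.
    tokens: dict word -> best Token in that word's (single) state
    frame_scores: dict word -> log p(frame | word) (hypothetical values)"""
    # 1. within-word: every token absorbs its word's frame score
    advanced = {w: Token(t.log_prob + frame_scores[w], t.history)
                for w, t in tokens.items()}
    # 2. word boundary: copy the best word-final token into every word
    #    start, recording the word it just left in its history
    best_w = max(advanced, key=lambda w: advanced[w].log_prob)
    exit_tok = advanced[best_w]
    for w in advanced:
        candidate = Token(exit_tok.log_prob, exit_tok.history + (best_w,))
        if candidate.log_prob > advanced[w].log_prob:
            advanced[w] = candidate
    return advanced
```

The within-word step is unchanged from isolated-word recognition; only step 2 is new, and the word sequence simply accumulates in each token's history rather than needing any separate decoding machinery.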
Putting it all together
Now we have all the components, it will be useful to see them all working together.
Automatic speech recognition
Automatic speech recognition using Hidden Markov Models and simple language models.