Finish

That was a lot of material to cover, and some of it is conceptually quite hard, so don’t worry if it takes you several attempts to understand it. The good news is that, once you understand the HMM for isolated whole word recognition, it’s relatively easy to extend it to continuous speech or sub-word units.

The following post shows an animation of token passing that will help you understand the algorithm.

Now that you understand token passing, here’s an early Christmas present for you: a token passing game! You’ll need the board and the Probability Density Functions (one for each state in the model). Your task is to compute the most likely state sequence for each of the two provided observation sequences. The rules: use the Viterbi algorithm!
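If you’d like to check your answers to the game with code, here is a minimal sketch of the Viterbi algorithm in Python. Everything in it is illustrative rather than taken from the course materials: the function names are hypothetical, emissions are assumed to be univariate Gaussians, and the model is assumed to be left-to-right, starting in its first state and ending in its last.

```python
import math

def log_gaussian(x, mean, var):
    """Log of a univariate Gaussian density N(x; mean, var)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def viterbi(observations, log_trans, means, variances):
    """Most likely state sequence for a left-to-right HMM (a sketch).

    observations     : list of scalar observations, one per time step
    log_trans[i][j]  : log P(state j at t+1 | state i at t); -inf if disallowed
    means, variances : Gaussian emission parameters, one per state
    """
    n_states = len(means)
    T = len(observations)

    # delta[t][j] = log probability of the best path ending in state j at time t
    delta = [[-math.inf] * n_states for _ in range(T)]
    backpointer = [[0] * n_states for _ in range(T)]

    # Assumption: the model must start in state 0
    delta[0][0] = log_gaussian(observations[0], means[0], variances[0])

    for t in range(1, T):
        for j in range(n_states):
            # Keep only the best predecessor of state j (the Viterbi recursion)
            best_i = max(range(n_states),
                         key=lambda i: delta[t - 1][i] + log_trans[i][j])
            delta[t][j] = (delta[t - 1][best_i] + log_trans[best_i][j]
                           + log_gaussian(observations[t], means[j], variances[j]))
            backpointer[t][j] = best_i

    # Assumption: the path must end in the final state; trace it backwards
    path = [n_states - 1]
    for t in range(T - 1, 0, -1):
        path.append(backpointer[t][path[-1]])
    path.reverse()
    return path
```

A toy usage example (the model and observations here are made up, not the ones from the game):

```python
NEG = -math.inf
log_trans = [[math.log(0.6), math.log(0.4), NEG],   # 3-state left-to-right model
             [NEG, math.log(0.7), math.log(0.3)],
             [NEG, NEG, math.log(0.9)]]
means, variances = [1.0, 4.0, 7.0], [1.0, 1.0, 1.0]
obs = [0.9, 1.2, 3.8, 4.1, 6.9, 7.2]
print(viterbi(obs, log_trans, means, variances))    # expect [0, 0, 1, 1, 2, 2]
```

Working in log probabilities, as above, avoids numerical underflow when many small probabilities are multiplied together; this is also standard practice in real ASR systems.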

What you should know

  • Hidden Markov Models (HMMs)
  • What are the components of an HMM (states, or ‘nodes’, and transitions, or ‘arcs’)? How do they relate to the visualisations of HMMs?
    • What do the state transition probabilities represent?
    • What do the state emission probabilities represent?
  • The Viterbi algorithm: this is only examinable at a conceptual level; you won’t be examined on the maths.
    • What does the time-state lattice (also called a trellis) represent for an observed sequence and an HMM?
    • What do you multiply together to find the probability of a specific alignment between states and observations (i.e. a specific path probability)?
    • What’s the naive way to determine the most probable path (i.e. alignment of states to observations) given an observation sequence and an HMM?
    • How does the Viterbi algorithm do this more efficiently?
    • What’s the relationship between Viterbi and DTW?
  • How does the token passing algorithm relate to the Viterbi algorithm? Why is token passing convenient for ASR?
  • Fitting the Gaussian to data: be able to recognise the correct formulae for calculating the mean and variance of some data points (see the formulae after this list).
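For reference, these are the standard maximum likelihood estimates of the mean and variance of $N$ data points $x_1, \dots, x_N$ (the course materials may use slightly different notation):

$$\hat{\mu} = \frac{1}{N}\sum_{n=1}^{N} x_n \qquad\qquad \hat{\sigma}^2 = \frac{1}{N}\sum_{n=1}^{N} \left(x_n - \hat{\mu}\right)^2$$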

Key Terms

  • Hidden Markov Model (HMM)
  • Markov property
  • state
  • transition
  • state transition probability
  • transition probability matrix
  • emission
  • state emission probability
  • state sequence
  • state alignment
  • observation sequence
  • path
  • probability
  • likelihood, maximum likelihood
  • argmax
  • prior probability
  • conditional probability
  • conditional independence
  • lattice, trellis
  • Viterbi algorithm
  • token passing algorithm
  • Gaussian probability density function
  • mean
  • variance
  • feature vector
  • decoding