Module 6 – The Viterbi algorithm

This topic has 1 reply, 2 voices, and was last updated 2 years, 7 months ago by Simon.

- October 31, 2022 at 09:08 #16176
  Rebecka N
  Student
  In waveform generation, the best sequence of units is defined as the one that minimises the sum of the target and join sub-costs (an argmin). The task of finding the best sequence of diphones in the database can be thought of as a task for a Hidden Markov Model, where the observations are the desired target units and the database diphones are the hidden states. We can therefore use the Viterbi algorithm to find this optimal sequence.
  I’m wondering about the implementation of the Viterbi algorithm in this case. I know the Viterbi algorithm as a “dynamic programming algorithm for obtaining the maximum a posteriori probability estimate of the most likely sequence of hidden states” (Wikipedia), i.e. finding the maximum value of the previous column in the Viterbi matrix and multiplying this value with the values in each cell of the current column. But here it seems that we would take the minimum value from the previous column and add it to the sum, as per the argmin equation for target and join costs?
  Is the Viterbi algorithm just the method for moving through a matrix, given a task defined as having hidden states and observations? Or is finding the maximum posterior probability key to its definition and implementation, so that I’m missing a step where we’d convert the values from our target- and join-cost equation into another form?
- November 1, 2022 at 15:43 #16208
  Simon
  Professor
  We’ll be covering the Viterbi algorithm in the next section of Speech Processing, about Automatic Speech Recognition. In Module 7, we will encounter the general algorithm of Dynamic Programming in the method known as Dynamic Time Warping. Later, we will see another form of Dynamic Programming, called the Viterbi algorithm, applied to Hidden Markov Models. So, wait for those parts of the course, then ask your question again.
  
  Your questions about whether to take the maximum or minimum relate to whether we are maximising the total (which is what we would do if it was a probability) or minimising it (which is what we would do if it was a distance).
  
  Regarding your questions about whether to multiply or sum: if we are working with probabilities, we multiply. If we are working with distances or costs, we sum (and in fact, we will end up working with log probabilities, which we will sum).
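
The point in the reply above can be illustrated with a small sketch. This is not from the course materials: the function `viterbi`, the toy unit names (`a1`, `b2`, ...), the cost values and the `join_cost` rule are all hypothetical, chosen only to show that the same dynamic programming recursion works with (min, sum) over costs or (max, multiply) over probabilities. Note that the choice of best predecessor is made over the accumulated score combined with the transition into the current cell, not over the previous column alone.

```python
import math
import operator


def viterbi(local_scores, transition, combine, choose):
    """Generic Viterbi search over a lattice (illustrative sketch).

    local_scores: list of dicts, one per position, {candidate: local score}
    transition:   function (prev_candidate, cur_candidate) -> score
    combine:      operator.add for costs, operator.mul for probabilities
    choose:       min for costs, max for probabilities
    """
    best = dict(local_scores[0])  # best accumulated score per candidate
    backpointers = []
    for column in local_scores[1:]:
        new_best, pointers = {}, {}
        for cur, local in column.items():
            # Best predecessor: accumulated score combined with the transition.
            prev = choose(best, key=lambda p: combine(best[p], transition(p, cur)))
            new_best[cur] = combine(combine(best[prev], transition(prev, cur)), local)
            pointers[cur] = prev
        backpointers.append(pointers)
        best = new_best
    # Trace back from the best final candidate.
    last = choose(best, key=best.get)
    total = best[last]
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, total


# Hypothetical target costs for three target positions, two candidates each.
target_costs = [
    {"a1": 1.0, "a2": 0.5},
    {"b1": 0.2, "b2": 1.5},
    {"c1": 0.7, "c2": 0.3},
]


def join_cost(prev, cur):
    # Made-up rule: units sharing a suffix digit join for free, others cost 1.
    return 0.0 if prev[-1] == cur[-1] else 1.0


# Minimise a sum of costs.
cost_path, cost_total = viterbi(target_costs, join_cost, operator.add, min)

# The same problem as probabilities, using p = exp(-cost): maximise a product.
probs = [{u: math.exp(-c) for u, c in col.items()} for col in target_costs]


def join_prob(prev, cur):
    return math.exp(-join_cost(prev, cur))


prob_path, prob_total = viterbi(probs, join_prob, operator.mul, max)

print(cost_path, cost_total)
print(prob_path, -math.log(prob_total))
```

Under the assumed mapping p = exp(-cost), both calls select the same path, and the negative log of the best product equals the lowest total cost. Working with log probabilities and summing is the same idea applied in the other direction.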