Module 6 – The Viterbi algorithm
- This topic has 1 reply, 2 voices, and was last updated 2 years, 1 month ago by Simon.
October 31, 2022 at 09:08 #16176
The target and join cost equation for waveform generation picks out the sequence with the minimum total cost, i.e. the argmin over the sum of the sub-costs. The task of finding the best sequence of diphones in the database can be thought of as a task for a Hidden Markov Model, where the observations are the desired states and the database diphones are the hidden states. We can therefore use the Viterbi algorithm to find this optimal sequence.
I’m wondering about the implementation of the Viterbi algorithm in this case: I know the Viterbi algorithm as a “dynamic programming algorithm for obtaining the maximum a posteriori probability estimate of the most likely sequence of hidden states” (Wikipedia), i.e. finding the maximum value of the previous column in the Viterbi matrix and multiplying this value with the values in each cell of the current column. But here it seems that we would take the minimum value from the previous column and add it to the sum, as per the argmin equation for target and join costs?
Is the Viterbi algorithm just the method for moving through a matrix, given a task defined as having hidden states and observations? Or is finding the maximum posterior probability key to its definition and implementation, and I’m missing a step where we’d convert the values from our target and join cost equation into another form?
November 1, 2022 at 15:43 #16208
We’ll be covering the Viterbi algorithm in the next section of Speech Processing, about Automatic Speech Recognition. In Module 7, we will encounter the general algorithm of Dynamic Programming in the method known as Dynamic Time Warping. Later, we will see another form of Dynamic Programming, called the Viterbi algorithm, applied to Hidden Markov Models. So, wait for those parts of the course, then ask your question again.
Your questions about whether to take the maximum or minimum relate to whether we are maximising the total (which is what we would do if it was a probability) or minimising it (which is what we would do if it was a distance).
Regarding your questions about whether to multiply or sum: if we are working with probabilities, we multiply. If we are working with distances or costs, we sum (and in fact, we will end up working with log probabilities, which we will sum).
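To make the max/multiply versus min/sum point concrete, here is a minimal Python sketch. It is an illustration under stated assumptions rather than the course's implementation: the function name `viterbi`, its parameters, and all the numbers are made up for this example, and it assumes the same number of states at every time step with a single fixed table of transition scores. The same dynamic programming search is run twice, once maximising a product of probabilities and once minimising a sum of negative log probabilities (which behave like costs).

```python
import math
import operator


def viterbi(local, trans, better, combine):
    """Find the best path through a lattice by dynamic programming.

    local[t][j] : score of being in state j at time t (hypothetical values)
    trans[i][j] : score of moving from state i to state j
    better      : max for probabilities, min for costs or distances
    combine     : operator.mul for probabilities, operator.add for costs
    """
    n_states = len(local[0])
    best = [list(local[0])]  # best[t][j]: best score of any path ending in j at time t
    back = []                # back[t-1][j]: predecessor of state j on that best path
    for t in range(1, len(local)):
        row, ptr = [], []
        for j in range(n_states):
            # Score of arriving in j from each possible predecessor i.
            cands = [combine(best[t - 1][i], trans[i][j]) for i in range(n_states)]
            i_best = cands.index(better(cands))
            row.append(combine(cands[i_best], local[t][j]))
            ptr.append(i_best)
        best.append(row)
        back.append(ptr)
    # Trace back from the best final state.
    j = best[-1].index(better(best[-1]))
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return path, better(best[-1])


# Made-up probabilities: 3 time steps, 2 states.
local_p = [[0.5, 0.1], [0.4, 0.3], [0.1, 0.6]]
trans_p = [[0.7, 0.3], [0.4, 0.6]]

# The same numbers as costs: negative log probabilities.
local_c = [[-math.log(p) for p in row] for row in local_p]
trans_c = [[-math.log(p) for p in row] for row in trans_p]

path_p, prob = viterbi(local_p, trans_p, max, operator.mul)  # maximise a product
path_c, cost = viterbi(local_c, trans_c, min, operator.add)  # minimise a sum

print(path_p)                           # → [0, 0, 1]
print(path_c == path_p)                 # → True
print(math.isclose(math.exp(-cost), prob))  # → True
```

Both runs recover the same path, because taking the negative log turns a product into a sum and a maximum into a minimum. For target and join costs, which are already costs, only the min/sum form applies and no conversion to probabilities is needed.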