Token passing
December 10, 2019 at 13:35 #10542
From the videos, I understood that token passing is a generative model in which tokens generate observations and explore multiple partial paths through the lattice in parallel. But in recognition, the observations do not need to be generated, since they are given to us. So my question is: do tokens actually generate observations at recognition time? If so, is it correct to assume that these observations are the given observations, whose probabilities we just have to “look up” in the Gaussian pdfs of the current state? Or do tokens not have to generate observations at all?
December 10, 2019 at 13:56 #10543
Token passing is an algorithm, not a model.
Tokens generate the given observations, and in doing so we compute the probability of each observation being generated by the state’s pdf. Yes, we just “look up” that probability.
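If it helps to make that concrete, here is a minimal sketch of the “look up” a state performs when a token generates the given observation. It assumes a diagonal-covariance Gaussian per state, and all the names are illustrative rather than taken from any particular toolkit:

import numpy as np

def log_gaussian(o, mean, var):
    # Log-probability of the given observation vector o under one state's
    # diagonal-covariance Gaussian pdf. Nothing is sampled: we simply
    # evaluate the pdf at the observation we were given.
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (o - mean) ** 2 / var)

# Score one given observation against one state's pdf (illustrative values).
o = np.array([1.2, -0.3])
state_mean = np.array([1.0, 0.0])
state_var = np.array([0.5, 0.5])
print(log_gaussian(o, state_mean, state_var))

Every token that arrives in that state at that time step adds this same log-likelihood to its own running log probability.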
December 4, 2020 at 14:11 #13379
I’m still a bit confused about token passing. I’ve read Young et al. (1989), “Token Passing: a Simple Conceptual Model for Connected Speech Recognition Systems”.
1. Why is it called a model here?
2. The paper says that calculating the probability matrix elements column by column is the basis of the Godin and Lockwood paper (as shown in picture 1), but that token passing leads to a much simpler and more powerful generalisation (as shown in picture 1). What is the difference between these two methods here? I understand that they both keep the argmax probability and drop the others. Or is token passing more powerful for an integrated recognition network rather than for an HMM alone?
Attachments:
December 5, 2020 at 15:39 #13420
Young et al. call token passing a “conceptual model”, by which they mean a way to think about dynamic programming as well as a way to implement it.
The two methods in Figures 2 and 3 of the paper are equivalent – they are both computing the most likely alignment (for DTW) or state sequence (for HMMs).
The power and generality of token passing comes into play when we construct more complex models with what the paper calls a “grammar” but we could more generally call a language model. Implementing the algorithm on a matrix (we could also call this a “grid”) is just fine for an isolated word model, but quickly becomes very complicated and messy for connected words with an arbitrary language model. In contrast, token passing extends trivially to any language model, provided it is finite state (e.g., an N-gram, or the hand-crafted grammar from the digit recogniser exercise).
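If it helps, here is a minimal token-passing sketch for a single left-to-right HMM, working in the log domain. The function and variable names are illustrative, not from the paper, and a connected-word recogniser would also store a word-end link in each token:

NEG_INF = float("-inf")

def token_passing(obs, log_A, log_b, n_states):
    # One token per state, holding only a log probability here; for connected
    # words each token would also carry its word-boundary history.
    tokens = [NEG_INF] * n_states
    tokens[0] = 0.0  # the only live token starts in the first state

    for o_t in obs:
        new_tokens = [NEG_INF] * n_states
        for j in range(n_states):
            # Pass a copy of every token along each transition into state j,
            # then keep only the best one (the Viterbi max).
            best = max(tokens[i] + log_A[i][j] for i in range(n_states))
            if best > NEG_INF:
                # The surviving token "generates" the given observation o_t by
                # looking up its log-likelihood under state j's pdf.
                new_tokens[j] = best + log_b(j, o_t)
        tokens = new_tokens

    # Log probability of the best complete path (exit-state handling omitted).
    return max(tokens)

Connecting this to a finite-state language model then just means passing tokens across the arcs that join word-end states to word-start states, adding the language model probability (in the log domain) as they cross; nothing about the per-state update above has to change.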