Phone HMM within embedded training
In embedded training, would one specific phone HMM be shared across words if the same phone occurred in multiple words in one sentence? Does this sharing include both the A matrix and the Gaussian parameters?
Yes, that’s correct – shared across all occurrences of that phone in the entire training data. We train one phone model on all available examples.
The implementation of this involves performing the E step for all data, “accumulating” (i.e., adding up) the necessary statistics (in fact we “accumulate” all the numerators and denominators of the M step equations, but don’t yet divide one by the other). Once all of that has been accumulated across the training data, the M step updates the model’s parameters. There is one numerator accumulator and one denominator accumulator for every individual model parameter (e.g., for the mean of each and every Gaussian).
(This implementation detail is not examinable for Speech Processing.)
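To make the accumulator idea above concrete, here is a minimal sketch (not the course's actual code; the state name, toy frames, and posteriors are invented for illustration) of updating one Gaussian mean: every occurrence of the phone, whichever word it came from, adds to the same numerator and denominator accumulators, and the single M-step division happens at the end.

```python
from collections import defaultdict

# Toy "training data": per-frame (state_id, posterior gamma, observation x).
# The [ow] frames come from different words, but feed the same accumulators.
frames = [
    ("ow_s1", 0.9, 1.2),  # [ow] frame from "zero"
    ("ow_s1", 0.8, 1.0),  # [ow] frame from "zero"
    ("ow_s1", 1.0, 1.4),  # [ow] frame from "oh"
]

num = defaultdict(float)  # numerator accumulator:   sum_t gamma_t * x_t
den = defaultdict(float)  # denominator accumulator: sum_t gamma_t

# E step: accumulate statistics across ALL the training data
for state, gamma, x in frames:
    num[state] += gamma * x
    den[state] += gamma

# M step: one division per parameter (here, each Gaussian mean)
means = {s: num[s] / den[s] for s in num}
print(means["ow_s1"])
```

The same pattern applies to the other M-step equations (variances, transition probabilities): one numerator and one denominator accumulator per parameter, divided only after the whole pass over the data.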
Thank you for the clarification. During decoding, in the hierarchical full graph, some phone HMMs also occur in several words (e.g. [ow] in “zero” and “oh” in Fig. 9.22 of Jurafsky & Martin). Are the A matrix and Gaussian parameters also shared among those occurrences?
Yes – there is only one model per phone (in the case of monophone models).
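One way to picture this sharing: in a monophone system, each word's pronunciation in the lexicon simply points at the single model per phone, so every occurrence references the same parameters. A hypothetical sketch (the class and phone inventory are invented for illustration):

```python
class PhoneHMM:
    """Stand-in for a monophone model; the A matrix and
    Gaussian parameters would live in this object."""
    def __init__(self, name):
        self.name = name

# One model object per phone in the inventory
models = {p: PhoneHMM(p) for p in ["z", "ih", "r", "ow"]}

# Word pronunciations reference the shared objects, not copies
lexicon = {
    "zero": [models[p] for p in ["z", "ih", "r", "ow"]],
    "oh":   [models["ow"]],
}

# The [ow] in "zero" and the [ow] in "oh" are the SAME object
print(lexicon["zero"][-1] is lexicon["oh"][0])  # True
```

Because both words hold a reference to the same object, any update to that phone's parameters is automatically seen by every word that uses it.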