- This topic has 1 reply, 2 voices, and was last updated 8 years, 6 months ago.
Labelling the diphones (not the features, just the phonemes)
I am trying to understand how the actual diphones get labelled (not the features, just the phoneme part). I am thinking of it as a pipeline which 1) looks up the recorded sentence’s words in the lexicon to get their pronunciations, 2) uses EM to identify the phonemes’ edges, and 3) finds the right place in the middle of each phoneme at which to cut the diphones.
I believe other things will also need to be taken care of, such as identifying unwanted silences in the middle of the sentence, or carefully locating the burst of a plosive.
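To make step 3 concrete, here is a minimal sketch of how I picture it, assuming the alignment step has already produced a start and end time for each phone. The `Phone` and `Diphone` classes, the `diphones_from_alignment` function and the timings are all made up for illustration, and cutting at the exact temporal midpoint is only my assumption; a real system may well place the cut points differently (for plosives, say).

```python
from dataclasses import dataclass


@dataclass
class Phone:
    label: str
    start: float  # seconds, assumed to come from the alignment step
    end: float


@dataclass
class Diphone:
    label: str
    start: float
    end: float


def diphones_from_alignment(phones):
    """Cut each diphone from the midpoint of one phone to the midpoint of the next.

    The midpoint rule is an illustrative assumption, not a claim about
    how any particular toolkit chooses its cut points.
    """
    units = []
    for left, right in zip(phones, phones[1:]):
        cut_left = (left.start + left.end) / 2
        cut_right = (right.start + right.end) / 2
        units.append(Diphone(f"{left.label}_{right.label}", cut_left, cut_right))
    return units


# Hypothetical alignment for the word "cat" with silence on either side.
alignment = [
    Phone("sil", 0.00, 0.20),
    Phone("k", 0.20, 0.30),
    Phone("ae", 0.30, 0.50),
    Phone("t", 0.50, 0.60),
    Phone("sil", 0.60, 0.80),
]

units = diphones_from_alignment(alignment)
for unit in units:
    print(unit.label, unit.start, unit.end)
```

With N aligned phones this gives N - 1 diphones, each spanning the join between two neighbouring phones.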
I know this is simplistic, but is this in the right direction?
Thanks
We’ll look at this in detail in the lecture.