- This topic has 3 replies, 3 voices, and was last updated 7 years, 10 months ago.
Unit selection errors
A query regarding a specific error in phoneme selection, for which I cannot identify a cause.
“The book was read by John”
Here, “read” is a past participle (vbn), pronounced “red”, not “reed”. In Festival, the POS tagging correctly marks “read” as vbn, and in lex.lookup_all the entry for read vbn has the correct pronunciation “red”, yet the segment “ii” (i.e. the pronunciation “reed”) is selected.
This sounds like a unit selection error. The most likely explanation is that there is a unit in the speech database that is labelled as the vowel in “red” but actually sounds like the vowel in “reed”.
It’s easy to see how that might happen: a front-end error during the labelling of the database. For example, the database utterance contained the word “read” pronounced as “reed”, but the front end predicted the phone sequence for the pronunciation “red”, and so that phone label was aligned with the speech. Automatic labelling works well, but may not always be able to detect that type of error.
The unit selection algorithm is susceptible to mislabelling errors and has only limited ways of detecting them at synthesis time.
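To make the point above concrete, here is a toy Python sketch (not Festival or multisyn code) of why a mislabelled unit can be chosen. It assumes candidates are retrieved purely by their label, so the audio itself is never checked. The `Unit` class, the `actual_sound` field, the utterance names, and the label “e” for the vowel in “red” are all hypothetical and used only for illustration. A real unit selection system works on diphones and combines target and join costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    label: str         # phone label assigned by automatic alignment
    actual_sound: str  # what was really spoken (invisible to the synthesiser)
    utterance: str     # hypothetical source utterance name


def candidates(database, target_label):
    """Return every unit whose label matches the target; the audio is not inspected."""
    return [u for u in database if u.label == target_label]


# Hypothetical miniature database. The second unit was labelled "e"
# (as in "red") but the speaker actually said "ii" (as in "reed").
database = [
    Unit(label="e", actual_sound="e", utterance="utt_0001"),
    Unit(label="e", actual_sound="ii", utterance="utt_0002"),
    Unit(label="ii", actual_sound="ii", utterance="utt_0003"),
]

picked = candidates(database, "e")

# Only possible in this toy, where the true sound is known in advance.
mislabelled = [u for u in picked if u.actual_sound != u.label]

for u in picked:
    print(u.utterance, u.label, u.actual_sound)
```

Both units labelled “e” are equally valid candidates as far as the label is concerned, so the mislabelled one can end up in the output if its costs happen to be lowest.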
Is there a way to identify the source utterance of each unit in the output and listen to it to debug errors like this?
Yes, there is – it’s described here, as part of the coursework for the Speech Synthesis course.
That will tell you the utterance number, and then you can look that up in the list of ARCTIC sentences that was used as the recording script for the cstr_edi_awb_arctic_multisyn voice.
To listen to the original source sentence, find the appropriate wav file here.
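If you have many units to check, the lookup step can be scripted. The following is a minimal Python sketch, assuming the recording script is stored in the festvox-style prompt format with one `( utterance_id "sentence text" )` entry per line. That file format, the function name `parse_prompts`, and the placeholder sentences are assumptions for illustration; check them against the actual script file for your voice.

```python
import re

# Assumed format, one prompt per line:
#   ( arctic_a0001 "Sentence text." )
PROMPT_RE = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')


def parse_prompts(text):
    """Map utterance IDs to sentence text, skipping lines that do not match."""
    prompts = {}
    for line in text.splitlines():
        match = PROMPT_RE.match(line.strip())
        if match:
            prompts[match.group(1)] = match.group(2)
    return prompts


# Placeholder content; in practice, read the real script file instead.
sample = '''( arctic_a0001 "First placeholder sentence." )
( arctic_a0002 "Second placeholder sentence." )'''

prompts = parse_prompts(sample)
print(prompts["arctic_a0002"])
```

With the sentence text in hand, you can check whether it contains a word such as “read” whose pronunciation the front end may have predicted incorrectly during labelling.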