vowel reduction at runtime
- This topic has 4 replies, 2 voices, and was last updated 8 years, 9 months ago by Simon.
March 19, 2016 at 21:23 #2834
I’m trying to synthesise ‘according to the’. I have a perfectly good, contiguous ‘according to’ in my database. However, at runtime, Festival produces this monophone list:
id _119 ; name ng ;
id _121 ; name t ;
id _122 ; name uu ; reducable 1 ; fullform uu ; reducedform @ ;
id _124 ; name dh ;
But in the Unit relation (which I can upload if you want, but it’s messy in this window, so just trust me), it’s clearly looking for the reducedform @. That forces it to jump to another wav file, and then, unfortunately, it gets stuck there because there is no @-dh diphone, so it inserts silence. So, a double whammy of badness. I would like to force Festival NOT to choose this reduction, specifically in this case, but in general if need be. I tried turning off all target costs, but it still prefers a non-zero join cost over a zero join cost at that join, because it is insisting on the reduced form. It’s hard to see where in the pipeline this is happening. Suggestions?
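For anyone wanting to check this kind of failure offline, here is a rough Python sketch, not Festival code. It parses a segment dump in the format shown above and lists which diphones would be missing from the database under the full and the reduced pronunciation. The `INVENTORY` set is a hypothetical stand-in for the real diphone coverage of a voice, and the helper names (`parse_segments`, `missing_diphones`) are made up for illustration.

```python
def parse_segments(dump: str) -> list[dict[str, str]]:
    """Parse lines like 'id _122 ; name uu ; reducable 1 ;' into feature dicts."""
    segments = []
    for line in dump.strip().splitlines():
        feats = {}
        for field in line.split(";"):
            parts = field.split(None, 1)  # feature name, then its value
            if len(parts) == 2:
                feats[parts[0]] = parts[1].strip()
        if feats:
            segments.append(feats)
    return segments


def diphones(phones: list[str]) -> list[str]:
    """Turn a phone sequence into adjacent-pair diphone names, e.g. 't-uu'."""
    return [f"{a}-{b}" for a, b in zip(phones, phones[1:])]


def missing_diphones(segments, inventory, reduce=False):
    """List diphones absent from the inventory, optionally applying reducedform."""
    phones = [
        seg["reducedform"] if reduce and seg.get("reducable") == "1" else seg["name"]
        for seg in segments
    ]
    return [d for d in diphones(phones) if d not in inventory]


DUMP = """
id _119 ; name ng ;
id _121 ; name t ;
id _122 ; name uu ; reducable 1 ; fullform uu ; reducedform @ ;
id _124 ; name dh ;
"""

# Hypothetical inventory (an assumption, not taken from any real voice):
# the full form is covered, and t-@ exists elsewhere, but @-dh does not.
INVENTORY = {"ng-t", "t-uu", "uu-dh", "t-@"}

segments = parse_segments(DUMP)
full_missing = missing_diphones(segments, INVENTORY, reduce=False)
reduced_missing = missing_diphones(segments, INVENTORY, reduce=True)
print(full_missing)     # []
print(reduced_missing)  # ['@-dh']
```

With a real inventory in place of the hypothetical one, a non-empty list for the reduced form would point at joins where the synthesiser has nothing to select, which is the situation described in the post.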
-
March 19, 2016 at 21:30 #2835
OK, I think I remember: it’s a ‘post-lexical rule’, which means…it’s not technically part of the lexical spec, which is why it’s not considered part of the target spec. OK, fine. According to the manual:
“Our vowel reduction model uses a CART decision tree to predict which syllables should be reduced”. Can I edit this? Or is there another approach: go into the original utt file and change the ‘uu’ to an ‘@’? Better yet, find an utt that has uu-dh in it and change the uu to ‘@’?
-
March 19, 2016 at 21:48 #2836
There’s this:
(postlex_unilex_vowel_reduction utt)
“Perform vowel reduction based on unilex specification of what can be reduced.”
But it’s not clear how to use it.
-
March 21, 2016 at 15:20 #2839
That is indeed the function that is doing the vowel reduction at synthesis time. You could try modifying (redefining) it, to prevent any vowel reduction at all, but you would then almost certainly make a lot of other test sentences worse.
-
-
March 21, 2016 at 15:19 #2838
The problem you have found is a mismatch between the vowel reduction rules used at synthesis time, and the method for identifying reduced vowels in the database. It is impossible(*) for these to perfectly match, because the speaker may not reduce exactly the vowels that the front-end predicts should be reduced.
Editing the labels on the database sounds like the solution in this case, assuming that the corrected label is a closer match to the speech. Of course, whilst you might improve this particular test sentence, you may make others worse.
(*) unless you can get the speaker to read out phonetic transcriptions of the sentences?