Letter-to-sound: Vowel Pronunciation
I’m wondering how Festival makes decisions about vowel pronunciation. Specifically, I’m curious about word-final “e”, which often signals how an earlier vowel in the word is pronounced (e.g., “fin” vs. “fine”). Does Festival have built-in rules that make predictions for these kinds of vowels? I assume it does, but I would like some clarification on how it works and what it looks for.
For words not in the dictionary, the letter-to-sound model (a classification tree) is used to predict the pronunciation. For each letter in the word, the classification tree predicts one phoneme, or epsilon (no phoneme, i.e. a silent letter), or two phonemes.
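To make those three kinds of output concrete, here is a minimal Python sketch of how one predicted label per letter could be joined into a pronunciation. The label names (`_epsilon_`, and the `k-s` style for a two-phoneme label), the phoneme symbols, and the hand-written predictions are all illustrative assumptions, not output from Festival.

```python
EPSILON = "_epsilon_"  # assumed label meaning "this letter produces no phoneme"

def join_predictions(per_letter_labels):
    """Turn one predicted label per letter into a flat list of phonemes."""
    phonemes = []
    for label in per_letter_labels:
        if label == EPSILON:
            continue  # silent letter: contributes nothing
        # Assumed convention: a two-phoneme prediction is written "a-b".
        phonemes.extend(label.split("-"))
    return phonemes

# Hand-written, illustrative predictions (not produced by a real tree).
fine = join_predictions(["f", "ay", "n", EPSILON])  # letters f, i, n, e
box = join_predictions(["b", "aa", "k-s"])          # letters b, o, x
print(fine)
print(box)
```

The point is only that the tree makes exactly one decision per letter, so silent letters and letters like “x” need the epsilon and two-phoneme labels.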
The predictors are the letter currently being considered and some context around it (e.g., +/- 3 letters). The predictee is the sound for that letter.
Let’s assume that your example word “fine” is not in the lexicon. When predicting the sound for the letter “i” the predictors will be:
null null f i n e null
so we can see that the word-final “e” is one of the predictors, and so is available to the classification tree when predicting the sound of the letter “i”. For your other example word “fin” the predictors will be
null null f i n null null
and since the predictors are different, the classification tree is able to separate the two cases using the question
Is the next-next letter = “e”?
which has the answer YES for “fine” and NO for “fin”.
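The example above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a window of 3 letters either side padded with “null”; the function names `context_window` and `next_next_is_e` are made up for this sketch and are not part of Festival.

```python
def context_window(word, index, width=3):
    """Return the letter at `index` plus `width` letters either side, padded with 'null'."""
    padded = ["null"] * width + list(word) + ["null"] * width
    centre = index + width  # position of the current letter in the padded list
    return padded[centre - width : centre + width + 1]

def next_next_is_e(window, width=3):
    """The tree question from the example: is the letter two places to the right an 'e'?"""
    return window[width + 2] == "e"

# Predictors for the letter "i" (index 1) in each word.
fine_i = context_window("fine", 1)
fin_i = context_window("fin", 1)
print(" ".join(fine_i))  # null null f i n e null
print(" ".join(fin_i))   # null null f i n null null
print(next_next_is_e(fine_i), next_next_is_e(fin_i))  # True False
```

A real tree is learned from the lexicon and asks many such questions in sequence; the single hand-written question here just shows why the word-final “e” is visible when the tree is deciding about the “i”.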