Phone set used for alignment vs synthesis

This topic has 3 replies, 2 voices, and was last updated 3 years ago by Alexandra S.

- March 30, 2023 at 14:43 #16799
  Alexandra S
  Student
  Hi,
  
  There are two sets of phones used for transcriptions: set “A” that is not marked for stress, and set “B” that is. Set A has more “versions” of the different phones (e.g. oi, oo, or, ou, ow, owr) than set B (e.g. ow and oy). For example, “goat” is transcribed to “g ou t” in set A, but “g ow t” in set B.
  
  If we look up a transcription in Festival, it uses set B. However, the alignment seems to be done with set A (the phones in the .mlf file).
  
  Which phone set is used for speech generation? I would guess set B, since that is what is used for transcriptions in Festival. Wouldn’t that clash with the phones used at the alignment step?
  
  Thank you
- March 30, 2023 at 18:34 #16802
  Korin Richmond
  Professor
  Yes, you need to use the same front end (i.e. phone set, lexicon, g2p model, etc.) for alignment and voice building as you will use at run time for the resulting synthetic voice.
  
  It sounds like you may have run the initial .mlf creation step using a different Festival voice? (Maybe the default one that’s loaded when you start Festival?)
- March 30, 2023 at 19:22 #16803
  Alexandra S
  Student
  I created the .mlf with the General American phone list (phone set “A” in my first post). How can I make sure it’s using that same phone set when it creates the .phones file?
- March 31, 2023 at 11:06 #16804
  Alexandra S
  Student
  I created that phone file with the same phone set (gam) with the command:
  festival --script $FESTVOXDIR/src/promptselect/text2utts.scm
  -eval festival_with_gam.scm
  -level Segment
  -itype data
  -o stories_utts.phones stories_utts.data
  
  I still get different phones in the .mlf file and the .phones file.
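
One way to pin down a mismatch like the one described above is to compare the phone inventories of the two files directly. The following is a minimal Python sketch, not part of the course tools. It assumes the .mlf file follows the usual HTK master label file layout (a `#!MLF!#` header, quoted per-utterance file names, one label per line with optional times, and a `.` terminator). It also assumes the .phones file is plain whitespace-separated phone names, which may not match what text2utts actually writes, so the `skip_fields` parameter and both function names are hypothetical and may need adjusting. The sample data is made up from the “goat” example in the first post.

```python
import re

# Matches plain integers or decimals, e.g. HTK start/end times and scores.
NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def mlf_phones(mlf_text):
    """Collect the set of label names in HTK-style master label file text."""
    phones = set()
    for line in mlf_text.splitlines():
        line = line.strip()
        # Skip blanks, the header, quoted file names, and "." terminators.
        if not line or line == "#!MLF!#" or line.startswith('"') or line == ".":
            continue
        # Drop numeric fields (times, scores); the first remaining field
        # is taken to be the label name.
        names = [f for f in line.split() if not NUMERIC.fullmatch(f)]
        if names:
            phones.add(names[0])
    return phones


def listed_phones(text, skip_fields=0):
    """Collect phone names from whitespace-separated lines.

    ASSUMPTION: each line is a sequence of phone names, optionally preceded
    by `skip_fields` non-phone fields such as an utterance id.
    """
    phones = set()
    for line in text.splitlines():
        phones.update(line.split()[skip_fields:])
    return phones


def compare_inventories(first, second):
    """Report which phone names appear in only one of the two sets."""
    return {
        "only_in_first": sorted(first - second),
        "only_in_second": sorted(second - first),
    }


# Made-up sample data based on the "goat" example from the thread.
sample_mlf = '#!MLF!#\n"*/utt_001.lab"\nsil\ng\nou\nt\nsil\n.\n'
sample_phones = "sil g ow t sil\n"

diff = compare_inventories(mlf_phones(sample_mlf), listed_phones(sample_phones))
print(diff)
```

If the two inventories differ systematically, as with “ou” versus “ow” here, that suggests the two files were produced with different front ends rather than with different pronunciations of individual words.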