- This topic has 2 replies, 2 voices, and was last updated 8 years, 6 months ago.
letter to sound alignment
Hi,
I am trying to build a simple letter-to-sound model for Polish TTS.
I used segmented wav and txt files from TUNDRA and created a utts.data file with each entry corresponding to a specific wav file, like below:
( siedemwybranychopowiadan_1a_orkan_00010 "j ę d r e k ś k l a r z m ł o d y g a z d a p o j e c h a ł d o a m e r y k i . ")
( siedemwybranychopowiadan_1a_orkan_00011 "i c a ł e l a t o m i n ę ł o a n i e d a ł z n a k u o s o b i e . ")
The words in each utterance are separated by a white space.
I also extracted MFCCs, which the alignment needs.
Now, I have my data file and mfccs and should be ready to proceed with the alignment, but the do_alignment file will need the following 3 files:
phone_list
phone_substitution
utts.mlf
Having skipped the 'dictionary' step, I am planning to do the following instead:
phone_list – insert all letters from the Polish alphabet
phone_substitution – leave empty
utts.mlf – letter transcription of each sentence including sp and sil where appropriate.
Can I ask whether I am on the right track with these ideas, please?
It would also be nice to know whether there is an alternative way to do it.
Thanks in advance,
Norbert
Sorry, I meant the letters in each utterance are separated by a white space.
I also replaced all non-ASCII characters, so that HTK does not complain.
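For anyone following along, the replacement of non-ASCII characters could look roughly like the sketch below. The ASCII names in `POLISH_TO_ASCII` (for example `eog` for ę) are purely hypothetical choices made for illustration, not the names used in this thread; any scheme works as long as each letter gets a distinct ASCII-only label.

```python
import unicodedata

# Hypothetical ASCII label names for the nine Polish letters with diacritics.
# These names are an assumption for illustration only.
POLISH_TO_ASCII = {
    "ą": "aog", "ć": "cac", "ę": "eog", "ł": "lst", "ń": "nac",
    "ó": "oac", "ś": "sac", "ź": "zac", "ż": "zdot",
}

def to_ascii_labels(letters):
    """Map a space-separated letter string to a list of ASCII-only labels."""
    # Normalise first so that a letter typed as base + combining mark
    # is treated the same as its precomposed form.
    letters = unicodedata.normalize("NFC", letters)
    labels = []
    for token in letters.split():
        token = token.lower()
        label = POLISH_TO_ASCII.get(token, token)
        if not label.isascii():
            raise ValueError(f"no ASCII mapping for {token!r}")
        labels.append(label)
    return labels

print(to_ascii_labels("j ę d r e k"))
```

Raising an error on unmapped characters, rather than silently dropping them, makes it easier to spot letters the table has missed.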
Yes – you are on the right lines – just assume that each letter is a phoneme.
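A minimal sketch of the letters-as-phonemes idea, assuming the `utts.data` layout shown in the question and the standard HTK master label file layout (`#!MLF!#` header, a `"*/name.lab"` pattern line, one label per line, and a closing `.`). The function names are hypothetical, and placing a single `sil` at each end of the utterance is an assumption; check what your copy of `do_alignment` expects, including where `sp` should go.

```python
import re

SILENCE = "sil"  # assumed label for utterance-initial and -final silence

def parse_utts_data(line):
    """Return (utterance_id, text) from a '( id "text" )' line."""
    match = re.match(r'\(\s*(\S+)\s+"(.*)"\s*\)\s*$', line.strip())
    if match is None:
        raise ValueError(f"unrecognised utts.data line: {line!r}")
    return match.group(1), match.group(2)

def letters_of(text):
    """Keep alphabetic tokens only, so punctuation such as '.' is dropped."""
    return [token for token in text.split() if token.isalpha()]

def build_mlf(utts):
    """Format {utt_id: [labels]} as an HTK master label file string."""
    lines = ["#!MLF!#"]
    for utt_id, labels in utts.items():
        lines.append(f'"*/{utt_id}.lab"')
        lines.extend([SILENCE, *labels, SILENCE])
        lines.append(".")
    return "\n".join(lines) + "\n"

def build_phone_list(utts):
    """Collect every label that occurs, plus silence, in sorted order."""
    seen = {label for labels in utts.values() for label in labels}
    return sorted(seen | {SILENCE})

# Shortened version of the first utterance, for illustration only.
utt_id, text = parse_utts_data(
    '( siedemwybranychopowiadan_1a_orkan_00010 "j ę d r e k . ")'
)
utts = {utt_id: letters_of(text)}
print(build_mlf(utts))
print(build_phone_list(utts))
```

Deriving the phone list from the letters that actually occur in the data, rather than from the full alphabet, avoids listing models that have no training examples. In practice the labels would be the ASCII replacements rather than the raw Polish letters. Because the letters are separated by single spaces, word boundaries are not recoverable from this `utts.data`, so the sketch cannot insert `sp` between words.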