Synthesising directly from a phone sequence rather than text
Hi,
I’m trying to synthesise speech directly from a sequence of phones. I tried the SayPhones function but could not find enough documentation. The following call, taken from a random paper on the web, only yields an EST error, “item is null so has no stress feature”, and other phone sequences fail in the same way.
festival> (SayPhones '(# n o t w @@r k i ng #))
I then stored a sequence of phones into a variable like so (using Utterance Phones):
festival> (set! someutt (Utterance Phones (# h @ l ou #)))
and tried to build the utterance step by step, much as we did in the first Speech Processing assignment, but that failed as well.
Could you explain how to synthesise from a phone sequence and, optionally, how to fine-tune parameters such as duration and stress?
Thank you,
Étienne
edit: fixed the link
I’m not sure of the solution to this. Let’s talk in person – is Festival the best framework for you, or should we consider a DNN system?
SayPhones will probably only work with a diphone voice, not with a Multisyn unit selection voice. Try loading a diphone voice and see if that works. I think you will get a flat (monotone) F0 contour, though.
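A minimal sketch of that suggestion, to be typed at the festival> prompt. It rests on several assumptions: that the British English diphone voice rab_diphone is installed (any other diphone voice would do, as long as the phones come from that voice's own phone set), and that the fixed prosody used for Phones utterances is controlled by the FP_duration and FP_F0 parameters, as the SayPhones documentation string suggests. The values 150 and 110 are arbitrary illustrations, not recommended settings.

```scheme
;; Assumption: the rab_diphone voice is installed; it uses the mrpa phone set,
;; in which # is silence and h, @, l, ou are valid phones.
(voice_rab_diphone)

;; Assumption: Phones utterances get one fixed duration and one fixed F0
;; for every segment, taken from these two parameters.
(Parameter.set 'FP_duration 150)   ; per-phone duration, believed to be in ms
(Parameter.set 'FP_F0 110)         ; flat pitch, believed to be in Hz

;; One-step synthesis and playback from a phone list.
(SayPhones '(# h @ l ou #))

;; The same thing step by step, keeping the utterance for inspection.
(set! someutt (Utterance Phones (# h @ l ou #)))
(utt.synth someutt)
(utt.play someutt)
```

If this works with the diphone voice but not with the Multisyn voice, that would support the explanation above. Because every phone gets the same duration and pitch, the result will sound flat; finer control over individual segments would need a different utterance type or edits to the utterance before waveform generation.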