F0 Target
Referring to Lecture 2, slides 62-64:
What constitutes ‘Bad F0’? Is this some kind of outlier, as in duration? There does not seem to be an outlier predictor function for F0.
And speaking of F0, is Festival attempting to target some kind of F0 contour? If so, how can we see this?
There appear to be some ToBI markings in the Utt files. Are we actually using these?
The source code says:
Specifically, if the targ/cand segment type is expected to be voiced, then an f0 of zero is bad (results from poor pitch tracking).
That is, all voiced sounds should have a value for F0, as determined by the pitch tracker.
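To make the check concrete, here is a minimal sketch in Python of the rule described in that comment. It is not Festival's actual code: the function name, the `VOICED_PHONES` set and the 0/1 penalty values are all assumptions for illustration only.

```python
# Hypothetical set of phone labels that we expect to be voiced.
# (Illustrative only; a real voice would derive this from its phone set.)
VOICED_PHONES = {"a", "i", "u", "m", "n", "l", "z", "v"}


def bad_f0_penalty(phone: str, f0_hz: float) -> float:
    """Return 1.0 if the segment should be voiced but has F0 == 0, else 0.0.

    An F0 of zero on a segment that is expected to be voiced is taken to
    mean that the pitch tracker failed there, so the unit is penalised.
    """
    expected_voiced = phone in VOICED_PHONES
    return 1.0 if expected_voiced and f0_hz == 0.0 else 0.0
```

With this sketch, a vowel whose pitch-tracked F0 is 0 gets the penalty, while an unvoiced fricative with F0 of 0 does not, because no F0 value is expected there.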
Festival’s multisyn unit selection engine uses a pure “IFF” (Independent Feature Formulation) target cost function, using Taylor’s terminology. It makes no explicit predictions of any acoustic properties, so there is no predicted F0 contour to inspect.
The ToBI predictions made by the front end are not used in the target cost.
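As a rough illustration of what an IFF-style target cost looks like, the sketch below compares linguistic features of the target and the candidate directly and sums weighted mismatches, with no acoustic prediction step. The feature names and weights are made up for the example and are not Festival's actual features or values.

```python
# Hypothetical feature weights; the names and numbers are assumptions.
WEIGHTS = {"stress": 10.0, "phrase_position": 7.0, "left_context": 4.0}


def iff_target_cost(target: dict, candidate: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of 0/1 feature mismatches, normalised to the range [0, 1].

    Only symbolic (linguistic) features are compared; nothing acoustic,
    such as an F0 value, is predicted for the target.
    """
    total = 0.0
    for name, weight in weights.items():
        mismatch = 0.0 if target.get(name) == candidate.get(name) else 1.0
        total += weight * mismatch
    return total / sum(weights.values())


# Usage: one mismatching feature (stress) out of three.
target = {"stress": 1, "phrase_position": "final", "left_context": "t"}
candidate = {"stress": 0, "phrase_position": "final", "left_context": "t"}
cost = iff_target_cost(target, candidate)
```

In a formulation like this, a check such as the bad-F0 one would simply be one more weighted term computed from the candidate, rather than a comparison against a predicted F0 target.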