- This topic has 1 reply, 2 voices, and was last updated 8 years, 7 months ago by .
Diphones and high level features
Why are families of units that have their joins in the middle of phones thought to produce better joins?
At the bottom of page 494, the author discusses high-level features and their complex influence. He says this could lead to a situation where the high-level features are included in addition to the F0 contour. What exactly does this mean? Are there any examples?
The reasons for making joins in the middle of phones, rather than at phone boundaries, were covered in Speech Processing. The main reason is that the middle of a phone is a more acoustically stable position, further away from the effects of co-articulation. Think of diphones as “units of co-articulation” which go from one stable mid-phone position to the next.
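To make the "mid-phone to mid-phone" idea concrete, here is a minimal sketch that turns a phone-level alignment into diphone units. It assumes the cut point is simply the temporal midpoint of each phone, which is a simplification; the function name `diphones_from_phones` and the example timings are made up for illustration.

```python
def diphones_from_phones(phones):
    """Convert a phone alignment into diphone units.

    phones: list of (label, start, end) tuples, times in seconds.
    Returns a list of (name, start, end) tuples, where each diphone runs
    from the midpoint of one phone to the midpoint of the next.
    Assumption: the temporal midpoint stands in for the "stable" position.
    """
    units = []
    for (left, l_start, l_end), (right, r_start, r_end) in zip(phones, phones[1:]):
        cut_left = (l_start + l_end) / 2   # join point inside the left phone
        cut_right = (r_start + r_end) / 2  # join point inside the right phone
        units.append((f"{left}-{right}", cut_left, cut_right))
    return units


# Hypothetical alignment for the word "cat" with surrounding silence.
phones = [
    ("sil", 0.00, 0.10),
    ("k", 0.10, 0.18),
    ("ae", 0.18, 0.30),
    ("t", 0.30, 0.38),
    ("sil", 0.38, 0.50),
]
units = diphones_from_phones(phones)
for name, start, end in units:
    print(f"{name}: {start:.2f} to {end:.2f}")
```

Note that the phone boundary, where co-articulation is strongest, always falls inside a unit, so it is never a join point.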
Your other point relates to what Taylor says on page 483: “These high-level features are also likely to influence voice quality and spectral effects and if these are left out of the set of specification features then their influence cannot be used in the synthesiser. This can lead to a situation in which the high-level features are included in addition to the F0 contour.”
I’ll touch on this in the lecture.
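As a rough illustration of what "included in addition to the F0 contour" could look like, here is a toy target cost written as a weighted sum of feature mismatches. This is only a sketch under my own assumptions, not Taylor's formulation: the feature names (`stress`, `phrase_final`), the weights, and the F0 scaling by 100 Hz are all invented for the example.

```python
def target_cost(target, candidate, weights):
    """Toy target cost: weighted sum of per-feature mismatches.

    'f0' is treated as continuous (absolute difference in Hz, scaled by an
    arbitrary 100 Hz); every other feature is compared as a 0/1 mismatch.
    All feature names and scalings here are illustrative assumptions.
    """
    cost = 0.0
    for feature, weight in weights.items():
        if feature == "f0":
            cost += weight * abs(target["f0"] - candidate["f0"]) / 100.0
        else:
            cost += weight * float(target[feature] != candidate[feature])
    return cost


target = {"f0": 120.0, "stress": True, "phrase_final": False}
cand_a = {"f0": 120.0, "stress": False, "phrase_final": False}  # right F0, wrong stress
cand_b = {"f0": 130.0, "stress": True, "phrase_final": False}   # F0 slightly off, right stress

f0_only = {"f0": 1.0}
f0_plus_high_level = {"f0": 1.0, "stress": 1.0, "phrase_final": 1.0}

# With F0 alone, the stress mismatch is invisible and cand_a looks best.
a_f0 = target_cost(target, cand_a, f0_only)
b_f0 = target_cost(target, cand_b, f0_only)

# With the high-level features included as well, cand_b is preferred.
a_all = target_cost(target, cand_a, f0_plus_high_level)
b_all = target_cost(target, cand_b, f0_plus_high_level)

print(a_f0, b_f0, a_all, b_all)
```

The point is that a candidate can match the F0 target exactly and still come from an unstressed syllable, with whatever voice quality and spectral properties go with that. If stress is not in the specification, the target cost has no way to penalise it.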