Degrading a signal
From the videos, I understand that when we implement unit selection using diphones, we can use techniques such as TD-PSOLA to manipulate F0 and duration. However, excessive manipulation can introduce perceptible artefacts. You mention in a video that this degrades the signal. Can you explain what ‘degrading’ means in this context?
Does it simply imply a lower-quality signal? If so, what exactly does ‘low quality’ mean in the context of speech synthesis?
The term “degrading” is somewhat informal, but what I mean is that the low-level signal quality is made worse.
This can be contrasted with other types of degradation, such as those we might get from unit selection: perceptible joins, incorrect co-articulation or bad prosody.
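To make the idea of signal-level degradation a little more concrete, here is a rough Python sketch. It is only a toy under stated assumptions, not the implementation used in the course or in any real synthesiser. The "speech" is a synthetic impulse train passed through a made-up decaying resonance, the pitch marks are assumed to be known exactly, and the function names (`voiced`, `toy_psola`, `relative_error`) are hypothetical. It moves two-period Hann-windowed segments to a new spacing, in the spirit of TD-PSOLA, and compares the result with an idealised signal generated directly at the new period. The error measure is purely numerical and says nothing definite about what a listener would perceive.

```python
import numpy as np

def voiced(pulse_positions, length, h):
    """Idealised voiced signal: unit impulses at pulse_positions, filtered by h."""
    excitation = np.zeros(length)
    excitation[pulse_positions] = 1.0
    return np.convolve(excitation, h)[:length]

def toy_psola(x, period, new_period):
    """Toy TD-PSOLA: re-space two-period Hann-windowed segments at new_period.

    Assumes pitch marks sit exactly every `period` samples (hypothetical setup).
    """
    win = np.hanning(2 * period + 1)[:-1]           # periodic Hann, length 2*period
    marks = np.arange(period, len(x) - period, period)
    out = np.zeros((len(marks) - 1) * new_period + 2 * period)
    for k, m in enumerate(marks):
        start = k * new_period                       # segment centre at period + k*new_period
        out[start:start + 2 * period] += x[m - period:m + period] * win
    return out

def relative_error(period, new_period, n_periods=40):
    """Relative L2 error between the toy PSOLA output and an idealised reference."""
    t = np.arange(400)
    h = np.exp(-t / 30.0) * np.sin(2 * np.pi * t / 25.0)   # made-up resonance
    length = period * n_periods
    x = voiced(np.arange(0, length, period), length, h)
    y = toy_psola(x, period, new_period)
    # Reference: same filter, excited directly at the new period, pulses aligned with y.
    ref = voiced(np.arange(period % new_period, len(y), new_period), len(y), h)
    mid = slice(len(y) // 4, 3 * len(y) // 4)        # ignore edge effects
    return np.linalg.norm(y[mid] - ref[mid]) / np.linalg.norm(ref[mid])

# Compare no modification (100 -> 100) with raising and lowering the period.
for new_period in (100, 90, 60, 140):
    print(new_period, relative_error(100, new_period))
```

With no modification the windows sum to one and the signal is reconstructed essentially exactly. Once the spacing changes, each segment still carries part of the neighbouring pitch period's response, which now lands in the wrong place, so the output departs from the idealised reference. That mismatch in the waveform itself is the kind of low-level degradation meant here, as opposed to a bad join or poor prosody.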