This topic has 1 reply, 2 voices, and was last updated 8 years, 6 months ago.
Acoustic-Space Formulation of Target Function
Taylor 16.4: For the target cost function, instead of using a feature representation (Hunt & Black), an 'acoustic representation' can be generated by 'partial synthesis'. However, this does not produce an actual waveform but an 'approximate waveform' – apparently still abstract, yet able to be compared (in 'perceptual space') with real candidate units.

What is this representation? (Taylor mentions cepstra: would this be represented as MFCCs?) If this method has managed to derive some acoustic, MFCC-like representation from data, has it not already done most of the work of a parametric synthesiser, short of actual waveform generation?
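To make the question concrete, here is a minimal sketch of what a target cost in acoustic space might look like, assuming (this is my assumption, not something Taylor specifies) that both the partially-synthesised target and the candidate unit are reduced to MFCC-like vectors and compared with a weighted Euclidean cepstral distance:

```python
import math

def acoustic_target_cost(target_vec, candidate_vec, weights=None):
    """Weighted Euclidean cepstral distance between a partially-synthesised
    target vector and a real candidate unit's vector.
    Illustrative sketch only: the exact parameterisation and distance
    metric are not fixed by the book; MFCCs are one plausible choice."""
    if weights is None:
        weights = [1.0] * len(target_vec)
    return math.sqrt(sum(w * (t - c) ** 2
                         for w, t, c in zip(weights, target_vec, candidate_vec)))

# Hypothetical low-dimensional vectors, purely for illustration
target = [1.0, 0.5, -0.2, 0.1]
candidate = [0.8, 0.6, -0.1, 0.0]
print(acoustic_target_cost(target, candidate))
```

The point of the acoustic formulation is that this comparison happens in a perceptually-motivated space, rather than via symbolic feature mismatch counts as in Hunt & Black.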
Related question: how does this representation relate to HMMs? Taylor mentions that the acoustic representation of a target can be matched against a distribution over units, but how do we get the target specification for the additional states? Do we interpolate between the specifications of neighbouring targets?
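One way to read the HMM connection (again an assumption on my part, not a claim about Taylor's formulation) is that the target for a state is not a single point but the state's output distribution, so the target cost of a candidate frame becomes its negative log-likelihood under that state's Gaussian:

```python
import math

def state_neg_log_likelihood(x, mean, var):
    """Cost of a candidate frame x under one HMM state's diagonal-covariance
    Gaussian: -log N(x; mean, var). Lower cost = better match.
    Sketch only: assumes the target 'specification' is the state's
    output distribution rather than a single interpolated point."""
    nll = 0.0
    for xi, mi, vi in zip(x, mean, var):
        nll += 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return nll
```

Under this reading, interpolating specifications between neighbouring targets would be one heuristic for filling in states that lack a direct specification, but matching against the learned state distributions avoids needing a point target at all.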