This topic has 1 reply, 2 voices, and was last updated 8 years, 4 months ago.
MFCCs?
I am confused about the relationship between MFCCs and the output features we use (mgc, lf0, vuv, bap). It seems to me that the vocoder extracts these features (mgc, lf0, vuv, bap) directly from the waveforms, and reconstructs waveforms from them.
So why do we need MFCCs during forced-alignment? Are the output features extracted from MFCCs of the waveforms?
Hi Hanzhang,
The vocoder extracts the parameters: spectral envelope, f0 contour, and aperiodicities. Then, you can transform them into MGCs (or MCEP), lf0, and bap, respectively.
Do not confuse MGCs (or MCEPs) with MFCCs; they are different features. The forced-alignment step uses MFCCs only to find the phone boundaries in the data, and the output features are not derived from them.
Thanks,
Felipe
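To make the "transform" step in the reply above a bit more concrete, here is a minimal sketch of one of the three conversions: turning an f0 contour into lf0 plus a vuv flag. This is not Merlin's actual code. The function names, the `-1.0e10` sentinel for unvoiced frames, and the choice of linear interpolation are assumptions made for illustration, and it assumes the vocoder reports unvoiced frames as f0 = 0. Note that nothing here touches MFCCs.

```python
import math

# Assumed sentinel for unvoiced frames (a hypothetical choice for this sketch).
UNVOICED_LF0 = -1.0e10


def f0_to_lf0_vuv(f0):
    """Convert an f0 contour in Hz (0.0 means unvoiced) to (lf0, vuv) lists."""
    lf0, vuv = [], []
    for hz in f0:
        if hz > 0.0:
            lf0.append(math.log(hz))  # natural log of f0 for voiced frames
            vuv.append(1)
        else:
            lf0.append(UNVOICED_LF0)  # placeholder, no pitch in this frame
            vuv.append(0)
    return lf0, vuv


def interpolate_unvoiced(lf0, vuv):
    """Fill unvoiced frames by linear interpolation between voiced neighbours.

    Leading and trailing unvoiced frames copy the nearest voiced value.
    If no frame is voiced, the contour is returned unchanged.
    """
    voiced = [i for i, v in enumerate(vuv) if v == 1]
    out = list(lf0)
    if not voiced:
        return out
    for i in range(len(out)):
        if vuv[i] == 1:
            continue
        prev = max((j for j in voiced if j < i), default=None)
        nxt = min((j for j in voiced if j > i), default=None)
        if prev is None:
            out[i] = lf0[nxt]
        elif nxt is None:
            out[i] = lf0[prev]
        else:
            w = (i - prev) / (nxt - prev)
            out[i] = (1.0 - w) * lf0[prev] + w * lf0[nxt]
    return out


# Toy contour: unvoiced, 100 Hz, unvoiced, 200 Hz, unvoiced.
f0 = [0.0, 100.0, 0.0, 200.0, 0.0]
lf0, vuv = f0_to_lf0_vuv(f0)
lf0_interp = interpolate_unvoiced(lf0, vuv)
```

The vuv stream keeps the voicing decision, so the interpolated lf0 can be modelled as a continuous trajectory and the unvoiced frames can still be restored at synthesis time.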