Is filterbank removable?
The filterbank may limit how much information MFCCs can represent. What if we remove the filterbank and calculate the cepstral coefficients directly?
We can still use the first 12 coefficients as the features for each frame.
I understand that the filterbank is designed according to the Mel scale, which is motivated by human hearing. But if we remove it, the MFCCs might perform better. So, can we simply remove the filterbank and compute MFCCs directly from the DFT of the sound?
Yes, this is a very reasonable proposition. Like many good ideas, it has been tried. Here’s a paper from Tokuda et al. (Tokuda is most famous for speech synthesis) on what they call Mel-Generalized Cepstral Analysis, which also shows the relationship between the cepstrum and LPC analysis. (This is well beyond the scope of the Speech Processing course!)
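To make the proposal in the question concrete, here is a minimal sketch of what "removing the filterbank" could look like: take the log magnitude of the DFT of one frame and apply a cosine transform to it directly, keeping coefficients 1 to 12. This is not the Mel-Generalized Cepstral Analysis from the paper, only the plain idea from the question. The function name `cepstrum_without_filterbank`, the 16 kHz sample rate, the 25 ms frame, the Hamming window, and the small log floor are all assumptions chosen for illustration.

```python
import numpy as np


def cepstrum_without_filterbank(frame, n_coeffs=12):
    """Hypothetical helper: cepstral coefficients computed straight from
    the log-magnitude DFT of one frame, with no Mel filterbank."""
    # Taper the frame to reduce spectral leakage (Hamming is an assumption).
    windowed = frame * np.hamming(len(frame))

    # Magnitude spectrum of the real-valued frame.
    magnitude = np.abs(np.fft.rfft(windowed))

    # Log compression; the small floor avoids log(0) (assumed value).
    log_magnitude = np.log(magnitude + 1e-10)

    # DCT-II over the frequency bins, written out explicitly with NumPy.
    n_bins = len(log_magnitude)
    k = np.arange(n_coeffs + 1)[:, None]   # coefficient index 0..n_coeffs
    n = np.arange(n_bins)[None, :]         # frequency bin index
    basis = np.cos(np.pi * k * (n + 0.5) / n_bins)
    coeffs = basis @ log_magnitude

    # Drop coefficient 0 (overall level) and keep coefficients 1..n_coeffs.
    return coeffs[1:]


# Synthetic 25 ms frame at an assumed 16 kHz sample rate (400 samples).
sample_rate = 16000
t = np.arange(400) / sample_rate
rng = np.random.default_rng(0)
frame = (np.sin(2 * np.pi * 200 * t)
         + 0.5 * np.sin(2 * np.pi * 1200 * t)
         + 0.01 * rng.standard_normal(len(t)))

features = cepstrum_without_filterbank(frame)
print(features.shape)
```

In this sketch the only smoothing of the spectrum comes from truncating the cepstrum to 12 coefficients, and the frequency axis stays linear rather than Mel-warped, so the result is a plain truncated cepstrum rather than an MFCC in the usual sense.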