Is filterbank removable?

This topic has 1 reply, 2 voices, and was last updated 5 years, 3 months ago by Simon King.

Author

Posts
- November 17, 2020 at 13:33 #13103
  Nian S
  Student
  The filterbank may limit the representational power of MFCCs. What if we remove the filterbank and calculate the MFCCs directly?
  
  We can still use the first 12 coefficients as the features for each frame.
  
  I understand that the filterbank is designed according to the Mel scale, which is motivated by human hearing. But if we remove it, the MFCCs might perform better. So, can we simply remove the filterbank and compute MFCCs directly from the DFT of the sound?
- November 17, 2020 at 14:54 #13106
  Simon King
  Professor
  Yes, this is a very reasonable proposition. Like many good ideas, it has been tried. Here’s a paper from Tokuda (most famous for speech synthesis) et al. on what they call Mel-Generalized Cepstral Analysis, which also shows the relationship between the cepstrum and LPC analysis. (This is well beyond the scope of the Speech Processing course!)
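As a rough illustration of the question (not of the Mel-Generalized Cepstral Analysis method in the paper), the sketch below computes cepstral coefficients straight from the log magnitude spectrum of one frame, with no filterbank in between. Strictly speaking the result is a plain cepstrum rather than an MFCC, because nothing warps the frequency axis to the Mel scale. The function names, the Hamming window, the log floor, and the choice to skip c0 and keep c1 to c12 are all assumptions made for this example. It uses only the Python standard library, so the DFT is the naive O(N^2) version.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude spectrum for bins 0..N/2 of a real frame (naive O(N^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
        mags.append(abs(s))
    return mags


def dct2(values, num_coeffs):
    """Unnormalised DCT-II, returning coefficients 0..num_coeffs-1."""
    m = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / m)
                for i, v in enumerate(values))
            for c in range(num_coeffs)]


def cepstrum_without_filterbank(frame, num_coeffs=12, floor=1e-10):
    """Hypothetical helper: window -> DFT -> log magnitude -> DCT.

    Returns c1..c{num_coeffs}; c0 (overall level) is dropped by assumption.
    """
    n = len(frame)
    # Hamming window to reduce spectral leakage at the frame edges.
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)))
                for t, x in enumerate(frame)]
    # The floor avoids log(0) on bins with no energy.
    log_spec = [math.log(max(mag, floor)) for mag in dft_magnitudes(windowed)]
    return dct2(log_spec, num_coeffs + 1)[1:]


if __name__ == "__main__":
    # Synthetic 64-sample frame with two sinusoids (illustrative input only).
    frame = [math.sin(2 * math.pi * 5.3 * t / 64)
             + 0.5 * math.sin(2 * math.pi * 11.7 * t / 64)
             for t in range(64)]
    print(cepstrum_without_filterbank(frame))
```

One thing this makes visible: without the filterbank, the DCT sees every DFT bin, including the fine harmonic structure that the filterbank would otherwise smooth away, so truncating to 12 coefficients is then the only smoothing applied to the spectral envelope.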