Spectral Analyses query
Hi.
I’m reading Holmes and Holmes chapter 8, where they assert that due to detailed spectral information not being available to humans at higher frequencies, the ‘effective filter bandwidth’ can be greater than the typical harmonic spacing.
I don’t understand the relationship between the harmonic spacing and ‘effective filter bandwidths.’ Are you able to explain how these relate?
Many thanks.
Matt.
Harmonic spacing: the interval (“distance, in frequency”) between two adjacent harmonics, in voiced speech. This interval will be equal to F0 since there is a harmonic at every integer multiple of F0.
Effective filter bandwidth: the cochlea can be thought of as a filterbank: a set of bandpass filters. The centre frequencies of the filters are not evenly spaced on a linear frequency scale. They get more widely spaced at higher frequencies. Their bandwidth (“width”) also increases with centre frequency.
For the filters at higher frequencies, this bandwidth is greater than the harmonic spacing. Therefore, the cochlea cannot resolve the harmonics at higher frequencies. Rather, the cochlea only captures the overall spectral envelope.
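To put rough numbers on this, here is a minimal sketch that compares an auditory filter bandwidth with the harmonic spacing. It assumes the Glasberg and Moore equivalent rectangular bandwidth (ERB) approximation as the model of "effective filter bandwidth", which is my choice for illustration and not necessarily the figure Holmes and Holmes use. The function names, the example F0 of 120 Hz, and the simple "bandwidth smaller than F0" test are also illustrative assumptions; real harmonic resolution is more gradual than a hard threshold.

```python
def erb_bandwidth_hz(centre_hz):
    """Approximate auditory filter bandwidth (ERB) in Hz.

    Assumption: Glasberg & Moore approximation,
    ERB = 24.7 * (4.37 * f_kHz + 1).
    """
    return 24.7 * (4.37 * centre_hz / 1000.0 + 1.0)


def harmonics_resolved(centre_hz, f0_hz):
    """Crude criterion: a filter can separate adjacent harmonics only
    if its bandwidth is narrower than the harmonic spacing (= F0)."""
    return erb_bandwidth_hz(centre_hz) < f0_hz


f0_hz = 120.0  # hypothetical F0 for a voiced speech frame
for centre_hz in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
    bandwidth = erb_bandwidth_hz(centre_hz)
    status = "resolved" if harmonics_resolved(centre_hz, f0_hz) else "merged"
    print(f"{centre_hz:6.0f} Hz: bandwidth {bandwidth:6.1f} Hz, harmonics {status}")
```

Under these assumptions the low-frequency filters are narrower than F0, so each one responds to a single harmonic, while the high-frequency filters span several harmonics and so only register the overall energy in that region.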
These facts about human hearing are the inspiration for the Mel-scale triangular filterbank commonly used as part of the sequence of processes for extracting MFCCs from speech.
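The same pattern shows up if you lay out a Mel-scale filterbank. The sketch below assumes one commonly used Mel formula (mel = 2595 log10(1 + f/700)); other variants exist, and the number of filters and the frequency range are arbitrary choices for illustration. It only computes the filter edge frequencies, not the triangular weights themselves.

```python
import math


def hz_to_mel(hz):
    # One common Mel-scale formula (an assumption; variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_edges(num_filters, low_hz, high_hz):
    """Edge frequencies in Hz for triangular filters spaced evenly in Mel.

    Filter i runs from edges[i] to edges[i + 2] and peaks at edges[i + 1],
    so num_filters filters need num_filters + 2 edge points.
    """
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + i * step) for i in range(num_filters + 2)]


num_filters = 20  # illustrative choice
edges = mel_filter_edges(num_filters, 0.0, 8000.0)
widths = [edges[i + 2] - edges[i] for i in range(num_filters)]

for i in range(num_filters):
    print(f"filter {i:2d}: centre {edges[i + 1]:7.1f} Hz, width {widths[i]:7.1f} Hz")
```

Because the filters are evenly spaced in Mel rather than in Hz, both the centre spacing and the width grow with frequency, so the upper filters cover several harmonics of a typical F0 and smooth them into an envelope.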