An introduction to signal processing for speech
- This topic has 3 replies, 4 voices, and was last updated 6 years, 3 months ago by Simon.
September 27, 2017 at 18:10 #7765
“A highly tuned resonance has a very sharp resonant peak as a function of frequency, and its resonant oscillations die away very slowly. A more strongly “damped” resonance dissipates more energy each cycle, meaning that oscillations die away more rapidly…”
This is a direct quote from the Sinusoid section, where Ellis is discussing exponential decay. What exactly is meant by a sharp resonant peak? Does it mean steeper sides reaching a sharp point? If so, wouldn’t this be symptomatic of a higher frequency? So, to summarise the passage as I read it: a higher frequency leads to a slower rate of exponential decay.
If I’m completely off the money, I’d just like some clarification as I’m not sure exactly what Ellis meant in this passage.
Thanks!
-
September 27, 2017 at 23:19 #7766
Maybe Ellis is talking about the ‘resonance curves’ that we saw in Ladefoged, which describe how a resonator both responds to and filters an input.
Having a sharper peak means the frequency range it responds to best is narrower (narrower bandwidth), but I think this is actually independent of whether that frequency range is high or low overall.
This screen grab from Ladefoged might clear things up.
It specifies 3 different resonators with 3 different center frequencies and 3 different bandwidths. They are sort of like 3 Gaussians with 3 different means and 3 different standard deviations, but without constant area (the area under a probability density function must equal 1).
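To make the bandwidth point concrete, here is a minimal sketch. It assumes the usual model of a simple (second-order) resonance, whose ringing is a sinusoid with the amplitude envelope exp(-pi * B * t), where B is the bandwidth in Hz. The function names and the three example resonators are made up for illustration and are not taken from Ellis or Ladefoged.

```python
import math

def envelope(bandwidth_hz, t):
    """Amplitude envelope exp(-pi * B * t) of the decaying oscillation."""
    return math.exp(-math.pi * bandwidth_hz * t)

def decay_time(bandwidth_hz):
    """Time in seconds for the envelope to fall to 1/e of its starting value."""
    return 1.0 / (math.pi * bandwidth_hz)

def damped_sinusoid(centre_hz, bandwidth_hz, t):
    """One sample of the decaying oscillation at time t (seconds)."""
    return envelope(bandwidth_hz, t) * math.cos(2.0 * math.pi * centre_hz * t)

# Hypothetical resonators: (centre frequency in Hz, bandwidth in Hz)
resonators = [(500.0, 50.0), (2500.0, 50.0), (500.0, 200.0)]
for centre_hz, bandwidth_hz in resonators:
    ms = 1000.0 * decay_time(bandwidth_hz)
    print(f"centre {centre_hz:.0f} Hz, bandwidth {bandwidth_hz:.0f} Hz: "
          f"decays to 1/e in {ms:.2f} ms")
```

Under this assumption the decay time depends only on the bandwidth: the first two resonators ring for equally long despite very different centre frequencies, while the third (wider peak, more damping) dies away sooner.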
-
October 1, 2018 at 14:14 #9382
Quick question: on page 778 of this reading the caption for figure 20.12 writes: ‘the third panel shows the first 13 values of the DCT of each column of the Mel spectrogram…[continued]’.
I would like to confirm if this ‘DCT’ refers to the Discrete Cosine Transform. If so, do I need to know more about why and how this is applied to the Mel spectrogram? Also, how does DCT differ from the Fourier Transform (and its permutations) such that it is used with the Mel spectrogram over Fourier Transforms?
Thank you!
-
October 3, 2018 at 09:26 #9386
Yes, DCT means Discrete Cosine Transform. We will be coming on to that in the later part of Speech Processing, when we consider how to extract useful features from the FFT spectrum, to use for Automatic Speech Recognition. We’ll also be looking at the Mel scale. Wait until we get there, then ask the question again.
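In the meantime, a rough sketch of what "the first 13 values of the DCT of each column" could look like in code. It assumes the common unnormalised DCT-II definition, which may differ in scaling from whatever the figure in the reading used. The 26 channel values are made-up numbers, not real Mel spectrogram data. One difference from the Fourier transform that is visible here: the input is projected onto cosines only, so real-valued input gives real-valued coefficients.

```python
import math

def dct_ii(values):
    """Unnormalised DCT-II: project the input onto cosines of increasing frequency."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]

# Made-up log Mel energies for one spectrogram column
# (26 hypothetical filterbank channels: a constant level plus one slow cosine ripple)
num_channels = 26
column = [
    10.0 + 3.0 * math.cos(math.pi * 2 * (i + 0.5) / num_channels)
    for i in range(num_channels)
]

coefficients = dct_ii(column)
first_13 = coefficients[:13]  # keep only the low-order coefficients
print([round(c, 3) for c in first_13])
```

Because the made-up column is a constant plus a single slow ripple, almost all of its content lands in just two of the coefficients, which gives a feel for why keeping only the first few can be enough to describe a smooth spectral shape.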