An introduction to signal processing for speech

This topic has 3 replies, 4 voices, and was last updated 6 years, 8 months ago by Simon.

Author

Posts
- September 27, 2017 at 18:10 #7765
  Mark L
  Student
  “A highly tuned resonance has a very sharp resonant peak as a function of frequency, and its resonant oscillations die away very slowly. A more strongly “damped” resonance dissipates more energy each cycle, meaning that oscillations die away more rapidly…” (Direct quote from the Sinusoid section.)
  
  Ellis is discussing exponential decay in this section. What is meant by a sharp resonant peak exactly? Does it mean steeper sides reaching a sharp point? If so, wouldn’t this be symptomatic of a higher frequency? So to summarise the passage, a higher frequency leads to a slower rate of exponential decay.
  
  If I’m completely off the money, I’d just like some clarification as I’m not sure exactly what Ellis meant in this passage.
  
  Thanks!
- September 27, 2017 at 23:19 #7766
  Jason F
  Tutor
  Maybe Ellis is talking about the ‘resonance curves’ that we saw in Ladefoged, which describe how a resonator both responds to and filters an input.
  
  Having a sharper peak means the range of frequencies it responds to best is narrower (narrower bandwidth), but I think this is actually independent of whether that range sits at a high or a low frequency overall.
  
  This screen grab from Ladefoged might clear things up
  
  The figure specifies 3 different resonators with 3 different center frequencies and 3 different bandwidths. They are sort of like 3 Gaussians with 3 different means and 3 different standard deviations, but without constant area (the area under a probability density function must be 1).
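
To make the link between peak sharpness and decay rate concrete, here is a minimal sketch. It assumes the standard simple model of a resonance as a damped sinusoid, exp(-pi * B * t) * cos(2 * pi * f0 * t), where B is the bandwidth in Hz and f0 is the center frequency. The function names and the example values are illustrative assumptions and do not come from Ellis or Ladefoged.

```python
import math


def damped_sinusoid(f0_hz, bandwidth_hz, t):
    """Value at time t (seconds) of a resonance modelled as a damped sinusoid.

    Assumed model: the envelope exp(-pi * B * t) is set by the bandwidth B,
    and the oscillation cos(2 * pi * f0 * t) is set by the center frequency f0.
    """
    envelope = math.exp(-math.pi * bandwidth_hz * t)
    return envelope * math.cos(2 * math.pi * f0_hz * t)


def decay_time(bandwidth_hz, drop_db=60.0):
    """Seconds for the envelope exp(-pi * B * t) to fall by drop_db decibels.

    Only the bandwidth appears here: the center frequency plays no part.
    """
    return drop_db * math.log(10) / (20 * math.pi * bandwidth_hz)


def q_factor(f0_hz, bandwidth_hz):
    """Sharpness of the peak relative to its center frequency."""
    return f0_hz / bandwidth_hz


if __name__ == "__main__":
    # Hypothetical resonators: (center frequency in Hz, bandwidth in Hz).
    for f0, bw in [(500.0, 50.0), (2500.0, 50.0), (500.0, 200.0)]:
        print(f0, bw, q_factor(f0, bw), decay_time(bw))
```

In this model the first two resonators have different center frequencies but the same bandwidth, so their oscillations die away at the same rate. The third has a wider bandwidth (a blunter peak) and decays faster, which matches the quoted passage.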
- October 1, 2018 at 14:14 #9382
  Danielle O
  Student
  Quick question: on page 778 of this reading, the caption for figure 20.12 reads: ‘the third panel shows the first 13 values of the DCT of each column of the Mel spectrogram…[continued]’.
  
  I would like to confirm if this ‘DCT’ refers to the Discrete Cosine Transform. If so, do I need to know more about why and how this is applied to the Mel spectrogram? Also, how does DCT differ from the Fourier Transform (and its permutations) such that it is used with the Mel spectrogram over Fourier Transforms?
  
  Thank you!
- October 3, 2018 at 09:26 #9386
  Simon
  Professor
  Yes, DCT means Discrete Cosine Transform. We will be coming on to that in the later part of Speech Processing, when we consider how to extract useful features from the FFT spectrum, to use for Automatic Speech Recognition. We’ll also be looking at the Mel scale. Wait until we get there, then ask the question again.
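
As a rough illustration of the operation named in the caption (taking the DCT of one column of a Mel spectrogram and keeping the first 13 values), here is a minimal sketch. It assumes the unnormalised DCT-II, which is one common definition. The 40-band example column is made up, and the function names are hypothetical. This is not necessarily the exact recipe used in the reading or in the course.

```python
import math


def dct2(x):
    """Unnormalised DCT-II: X[k] = sum over n of x[n] * cos(pi / N * (n + 0.5) * k).

    The input is a list of real numbers and so is the output. A DFT of the
    same input would in general give complex values.
    """
    n_points = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_points * (n + 0.5) * k)
            for n in range(n_points))
        for k in range(n_points)
    ]


def first_coefficients(mel_column, n_keep=13):
    """Keep only the first n_keep DCT values of one spectrogram column."""
    return dct2(mel_column)[:n_keep]


# Made-up column with 40 Mel bands: a single smooth bump, for illustration only.
example_column = [math.exp(-((band - 10) / 6.0) ** 2) for band in range(40)]

if __name__ == "__main__":
    print(first_coefficients(example_column))
```

The low-numbered DCT values describe the broad shape of the column, so keeping the first 13 gives a compact summary of the spectral envelope.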