Ask questions about signal processing here!
- This topic has 6 replies, 3 voices, and was last updated 1 year, 3 months ago by Catherine Lai.
October 5, 2023 at 11:20 #16892
Hi all,
I’m just flagging this as the forum to ask questions about signal processing (i.e. Modules 3-4 in 2023-24). Feel free to ask questions and also to have a go at answering other people’s questions 🙂
cheers,
Catherine -
October 12, 2023 at 22:46 #16906
Hi Catherine,
Regarding the convolution theorem,
h(k) * x(n) = H(m) × X(m)
I wonder which of H(m) and X(m) represents the spectral envelope, and which represents the harmonics?
Many thanks!
-
October 13, 2023 at 12:03 #16908
Hi Yujia,
The way we talked about it in class,
X(m) represents the magnitude spectrum of the input window x(n). Similarly, H(m) represents the magnitude spectrum of the filter h(k).
So if x(n) represents the impulse train, X(m) will represent the harmonics. Since h(k) represents the filter, H(m) will give us the shape of the spectral envelope.
It’s worth noting that the convolution theorem also applies more generally: x(n) and h(k) could represent other types of signals (rather than just source and filter). But we are focusing on source and filter in this class.
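To make this concrete, here is a minimal NumPy sketch (not something from class). The window length, the impulse spacing and the filter values are made-up numbers chosen for illustration. It also assumes circular convolution, which is the form of convolution the DFT version of the theorem applies to.

```python
import numpy as np

N = 16  # window length (assumed for illustration)

# Source x(n): an impulse train with one impulse every 4 samples (made-up spacing)
x = np.zeros(N)
x[::4] = 1.0

# Filter h(k): a short decaying impulse response (made-up values)
h = np.zeros(N)
h[:4] = [1.0, 0.5, 0.25, 0.125]

# Circular convolution of h and x in the time domain
y = np.array([sum(h[k] * x[(n - k) % N] for k in range(N)) for n in range(N)])

# DFTs of the source, the filter and the convolved signal
X = np.fft.fft(x)
H = np.fft.fft(h)
Y = np.fft.fft(y)

# Convolution in time corresponds to multiplication in frequency
theorem_holds = np.allclose(Y, H * X)
print("DFT of convolution equals H * X:", theorem_holds)

# X is non-zero only at evenly spaced bins: these are the harmonics
harmonic_bins = np.nonzero(np.abs(X) > 1e-9)[0]
print("Non-zero bins of X:", harmonic_bins)

# |H| varies smoothly across bins: this is the envelope that scales the harmonics
print("Magnitude of H:", np.round(np.abs(H), 3))
```

The output spectrum Y keeps the harmonic spacing of X, with each harmonic scaled by the value of H at that bin.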
cheers,
Catherine -
October 16, 2023 at 22:23 #16959
Hi Catherine,
I’m looking at the Fourier transform. It seems that the input points x[n] start at n = 0 and end at n = N-1. The analysis frequencies e^(-j*2pi*n*k/N) also start at k = 0 and end at k = N-1.
Could you help explain why they start at 0? Is it just an indexing issue?
Besides, I believe the lowest analysis frequency is at k = 1. When the input signal has 16 samples and the sampling rate is 800 Hz, DFT[1] (with an analysis frequency of 50 Hz) and DFT[15] (with an analysis frequency of 750 Hz) are mirrored around DFT[8]. Where should we put DFT[0] (and maybe a potential DFT[16], with an analysis frequency of 800 Hz)?
Thanks a lot.
Xueyan -
October 17, 2023 at 10:40 #16960
Hi Xueyan,
You’re right, the lowest non-zero analysis frequency is at DFT[1] (i.e., k=1), but the actual DFT outputs start at index k=0. We skipped over this a bit in class, but essentially DFT[0] represents the vertical “bias” of the waveform. This tells you whether the waveform in the input window is symmetric around 0 (in amplitude) or whether it’s shifted up (or down) on average.
This bias term doesn’t actually tell us what frequencies are present in the input, so we usually don’t think too much about it. But you can see a difference in DFT[0] if you compare a sine wave that matches one of the analysis frequencies as input (which has amplitude ranging from -1 to +1) and an input window containing one impulse and the rest zeros (so has amplitude of either 0 or +1). In this case, DFT[0] for the sine wave will be zero, while DFT[0] for the impulse will be 1.
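If you want to check this yourself, here is a small NumPy sketch. It assumes a window of N=16 samples and the unnormalised DFT that `np.fft.fft` computes, where DFT[0] is simply the sum of the samples in the window. The shifted sine wave is an extra case added for illustration.

```python
import numpy as np

N = 16  # window length (assumed for illustration)
n = np.arange(N)

# A sine wave with exactly one cycle in the window, so it matches analysis frequency k=1
sine = np.sin(2 * np.pi * 1 * n / N)

# A window containing a single impulse and the rest zeros
impulse = np.zeros(N)
impulse[0] = 1.0

# The same sine wave shifted up by 0.5 (extra case for illustration)
shifted_sine = sine + 0.5

# DFT[0] is the sum of the samples in the window
dc_sine = np.fft.fft(sine)[0]
dc_impulse = np.fft.fft(impulse)[0]
dc_shifted = np.fft.fft(shifted_sine)[0]

print("DFT[0] sine:        ", np.round(dc_sine.real, 6))
print("DFT[0] impulse:     ", np.round(dc_impulse.real, 6))
print("DFT[0] shifted sine:", np.round(dc_shifted.real, 6))
```

The sine wave is symmetric around zero, so its samples cancel out. The impulse and the shifted sine both sit above zero on average, so their DFT[0] is positive.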
In terms of the mirroring, DFT[0] would mirror DFT[16], where the number of samples in the input window is N=16. But the DFT formulation doesn’t include this (k=0,…,N-1). Like DFT[0], DFT[N] wouldn’t tell us anything about the frequency components in the input.
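Here is a short NumPy sketch of the mirroring, using the numbers from the question (N=16 samples, sampling rate 800 Hz). The input is just random real-valued noise, chosen arbitrarily. The DFT[16] value is computed by hand from the DFT sum, since `np.fft.fft` only returns k=0,…,N-1.

```python
import numpy as np

fs = 800  # sampling rate in Hz (from the question)
N = 16    # number of samples in the input window (from the question)
n = np.arange(N)
k = np.arange(N)

# Analysis frequency for each DFT index: k * fs / N
bin_freqs = k * fs / N
print("Analysis frequencies (Hz):", bin_freqs)

# Any real-valued input will do; random noise is an arbitrary choice
rng = np.random.default_rng(0)
x = rng.standard_normal(N)

spectrum = np.fft.fft(x)
mag = np.abs(spectrum)

# For real input, the magnitude at index k equals the magnitude at index N-k
mirrored = np.allclose(mag[1:], mag[1:][::-1])
print("mag[k] == mag[N-k] for k=1..N-1:", mirrored)

# A hypothetical DFT[16], computed directly from the DFT sum with k=N
dft_16 = np.sum(x * np.exp(-2j * np.pi * n * N / N))
same_as_dc = np.isclose(dft_16, spectrum[0])
print("DFT[16] equals DFT[0]:", same_as_dc)
```

So DFT[1] pairs with DFT[15], DFT[2] with DFT[14], and so on around DFT[8], while a DFT[16] would just repeat DFT[0].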
cheers,
Catherine -
October 18, 2023 at 15:26 #16984
Hi Catherine,
Regarding the ‘vertical bias’ you mentioned in the previous post, I wonder how come the impulse train’s amplitude does not drop below zero? I mean, other waves we’ve seen usually go up and down, and by the time all the energy has been consumed the amplitude stays at zero. Is it because the impulse train contains energy at all frequencies and they cancel each other out, resulting in the positive-only amplitude?
Many thanks.
Best
Yujia -
October 23, 2023 at 11:01 #16996
Hi Yujia,
Great question! In general, sound waves don’t have to be symmetric like sinusoids (e.g. sine waves). But what Fourier analysis tells us is that we can approximate periodic waves that don’t look at all like sinusoids by adding together scaled and shifted sinusoids of different frequencies.
So, yes, if we want to make a spiky and non-symmetric waveform like an impulse train, we actually need to use sinusoids of every frequency we have available to us (scaled and shifted to get rid of the negative amplitude bits and also all the curvy bits!).
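You can try this in a few lines of NumPy. This is a sketch that assumes a window of N=16 samples and gives every analysis frequency the same weight of 1/N. It builds one period of an impulse train (a single impulse in the window); repeating the window gives the train.

```python
import numpy as np

N = 16  # window length (assumed for illustration)
n = np.arange(N)

# Add together equally weighted cosines at every analysis frequency k = 0..N-1.
# The k = 0 term is a constant: this is the vertical bias.
total = np.zeros(N)
for k in range(N):
    total += np.cos(2 * np.pi * k * n / N) / N

# The target: one impulse at n = 0 and zeros everywhere else
impulse = np.zeros(N)
impulse[0] = 1.0

matches_impulse = np.allclose(total, impulse)
print("Sum of cosines is an impulse:", matches_impulse)

# Leave out the constant k = 0 term and the waveform dips below zero
no_bias = total - 1.0 / N
print("Minimum without the bias term:", np.round(no_bias.min(), 4))
```

Away from n=0 the cosines cancel each other exactly, and the constant term is what lifts the leftover negative values back up to zero.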
An important thing to note here is that the impulse train of zeros and ones is really an idealised model of the source. Though we can use it to synthesize speech sounds, the actual human vocal source is much more complicated. Actual glottal pulses don’t really look like a single spike, and they do produce negative pressure, but they still have a positive bias as they are not symmetric.
cheers,
Catherine