Resonant tube

To understand how the vocal tract modifies sound, we need to start with the concept of resonance.

Inspecting speech signals in the frequency domain revealed a really important property: the spectral envelope.
So we're now going to develop an explanation of where and how the spectral envelope is created.
To do that, we need to start with an understanding of how sound - for example, created by the vocal folds - behaves inside the vocal tract and how the vocal tract modifies that basic sound source by acting as a resonator.
Here's our vocal tract and, as usual, we'll simplify it.
We'll make it a tube; in fact, just a straight tube.
Maybe now's a good time to explain why we're always simplifying things.
First and foremost, it's so that we can focus on understanding essentials without being distracted by less important details.
In this case, the curve of the vocal tract is an unimportant detail.
It doesn't matter for sound propagation, so we'll assume the tube behaves like a straight tube.
But this simplification isn't just for us (for our learning).
We're also progressing towards a computational model of speech: in fact, of speech signals.
That model will actually be useful in real engineering applications, such as speech modification for Speech Synthesis or feature extraction for Automatic Speech Recognition.
This simplification is a necessary step to make it possible to build that model.
So the vocal tract is a tube.
I've assumed it's straight.
For now, let's assume that it has a uniform cross section.
Obviously, that's not always true: we can move our tongue and lips to vary the tube shape.
But our first goal here is just to understand what resonance is, and that this tube is a resonator.
Once we've got that understanding, we'll be able to extend it later to more interesting tube shapes.
But for the purpose of understanding, it makes most sense to use the simplest tube, and that's this one here.
In fact, let's keep simplifying.
I'm going to assume for a moment that the tube is closed at both ends.
That will make resonance easier to understand.
Let's introduce a source of sound at one end of this tube: an impulse.
That's simulating one cycle of the vocal folds.
That sound propagates down the length of the tube.
When that sound wave reaches the end of the tube, it will be reflected and bounce back.
So the sound wave will bounce backwards and forwards end-to-end along the tube forever.
This is a perfect tube (it's just a model): there are no losses; energy is conserved.
You'll notice that the sound wave is now drawn as a vertical line.
The length of the tube is much greater than its width, so we'll consider only its length, which is the most important dimension.
In other words, it's a 1-dimensional simulation, and so we only need to consider the sound wave propagating in that dimension.
This sound wave bouncing endlessly up and down the tube is a standing wave.
Let's make a measurement of that sound inside the tube.
Let's measure the sound at this point X and calculate the frequency of that sound.
In other words, how many times per second does the sound pressure make a complete cycle at point X?
You'll need a bit more information to calculate that.
I'm going to tell you the length of the tube.
It's 0.175 m - in other words, 17.5 cm.
That's comparable to an average vocal tract.
I'm also going to tell you how fast the sound wave travels.
It travels at the speed of sound, which is a constant.
We'll make that a nice round number of 350 m s^-1.
So for the sound to make a complete cycle at point X, it has to travel twice the length of the tube.
It passes through point X now, and it has to make one length of the tube, a second length of the tube, and then it goes through point X again.
That's one complete cycle.
So you know the length of the tube.
You know the speed of sound.
Work out how many times per second we get cycles at point X.
As a hint, I suggest you work out the period and from that compute the frequency.
Pause the video.
The wave has to travel a round trip of 0.35 m and at this speed of sound that's going to take 1/1000 of a second.
That's the fundamental period; let's just call that T.
A sound with that fundamental period will have a frequency of 1 over that, so the frequency is going to be 1000 Hz.
So simply by introducing one impulse into this tube, and allowing that sound wave to bounce backwards and forwards end-to-end along the tube, we have a standing wave at a frequency of 1000 Hz.
That is resonance.
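The period and frequency calculation can be checked with a few lines of Python. This is only a sketch of the arithmetic for the idealised tube closed at both ends; the function name `round_trip_frequency` and its parameter names are illustrative and not from the video.

```python
def round_trip_frequency(tube_length_m, speed_of_sound_m_per_s):
    """Return (period in s, frequency in Hz) for a pulse bouncing
    end-to-end in an idealised lossless tube closed at both ends."""
    # One complete cycle at any point means travelling the tube twice.
    round_trip_m = 2 * tube_length_m
    period_s = round_trip_m / speed_of_sound_m_per_s
    frequency_hz = 1.0 / period_s
    return period_s, frequency_hz


# Values given in the video: 0.175 m tube, 350 m/s speed of sound.
period_s, frequency_hz = round_trip_frequency(0.175, 350.0)
print(f"period = {period_s:.4f} s, frequency = {frequency_hz:.0f} Hz")
```

Trying a different `tube_length_m` shows the dependence on length: halving the tube halves the round trip and so doubles the frequency.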
It takes the sound wave a fixed amount of time to go to one end of the tube and back again, and that amount of time is determined by two things.
Obviously, the first of those is the speed of sound, and that's a constant.
The other is the length of the tube.
What if, as this pulse travels up the tube, and is reflected at this end, and arrives back at its source, I add another little pulse of sound?
That's going to add to the air pressure at this point.
I'll draw that by making the line a bit thicker.
That new pulse travels down the tube, overlaid on top of the pulse that's just been reflected.
Now we have a pulse with a greater pressure difference to the ambient pressure.
In other words, a higher-amplitude sound wave.
If I keep doing that, if I keep adding tiny amounts of energy to this system at just the right moment in time, I can obtain larger and larger amplitude sound waves.
Let's talk about that in terms of frequency.
If I keep adding energy to this system at the right frequency, I can increase the amplitude of this sound wave.
That's the power of resonance.
Very small amounts of input energy at the right frequency - at the resonant frequency of the system - can result in a very large output from the system.
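One way to see this build-up is a toy simulation, sketched here under strong assumptions. The lossless tube is modelled as a circular delay line whose length is one round trip measured in time steps, and each push adds a fixed amount of pressure to whatever part of the wave is passing the source at that moment. The function name, the step counts, and the push size are all hypothetical choices for illustration.

```python
def peak_amplitude(round_trip_steps, push_interval_steps, n_pushes, push_size=1.0):
    """Largest pressure anywhere in the travelling wave after n_pushes.

    The wave is stored as a circular list with one slot per time step of
    the round trip. The slot passing the source at time t is
    t % round_trip_steps, because the lossless wave repeats every round trip.
    """
    wave = [0.0] * round_trip_steps
    for k in range(n_pushes):
        t = k * push_interval_steps          # time of the k-th push
        wave[t % round_trip_steps] += push_size
    return max(wave)


# Pushing once per round trip: every push lands on the same pulse.
in_time = peak_amplitude(round_trip_steps=10, push_interval_steps=10, n_pushes=10)

# Pushing at a mismatched interval: the pushes are spread along the wave.
mistimed = peak_amplitude(round_trip_steps=10, push_interval_steps=13, n_pushes=10)

print(in_time, mistimed)
```

With these numbers, the ten well-timed pushes stack into a single pulse ten times the size of one push, while the ten mistimed pushes each land on a different part of the wave and never add up.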
Now we did that here with a tube that's closed at both ends.
We'll find out later that the same principle applies even if the tube is open at one end, because an open end will still reflect the sound wave.
Any tube is a resonator and will have a resonant frequency related to its length.
Lots of physical systems exhibit resonance.
Resonance means that you can obtain a large response at a particular frequency if you input energy at that frequency.
This swing has exactly that property.
You could make this swing move a lot by pushing gently with just your little finger, but you must do it at the right moment in the cycle.
In other words, you must put in energy at the swing's resonant frequency.
If you try pushing this swing at a different frequency, you won't get such a large response.
You get a large output if you put the input at the right frequency, but the resonator attenuates other frequencies.
If you put energy in at the wrong frequency, you get almost no output.
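The swing's behaviour can be sketched numerically if we assume it acts like a damped, linear oscillator driven by a steady sinusoidal push, for which there is a standard textbook formula for the steady-state amplitude. That model, the natural frequency of 0.5 Hz, and the damping value are assumptions made up for this illustration; they are not measurements of any real swing.

```python
import math


def steady_state_amplitude(drive_hz, natural_hz=0.5, damping_per_s=0.05,
                           force_per_mass=1.0):
    """Steady-state amplitude of x'' + damping*x' + w0^2 * x = F/m * cos(w*t)."""
    w = 2 * math.pi * drive_hz        # driving angular frequency
    w0 = 2 * math.pi * natural_hz     # resonant (natural) angular frequency
    return force_per_mass / math.sqrt((w0 ** 2 - w ** 2) ** 2
                                      + (damping_per_s * w) ** 2)


# The same gentle push, applied at three different frequencies.
for drive_hz in (0.25, 0.5, 1.0):
    print(f"{drive_hz:.2f} Hz -> amplitude {steady_state_amplitude(drive_hz):.3f}")
```

With these made-up values, the response at 0.5 Hz is many times larger than at 0.25 Hz or 1 Hz, even though the size of the push is the same in all three cases.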
Have a look around you today and try to find other resonant objects.
You'll find many, many of them.
Tubes are resonators.
They have a frequency at which they will resonate.
The vocal tract is a tube, so it must have that property.
But of course, it's not just a simple tube.
A speaker can vary the shape of their vocal tract, and that's going to vary its resonant frequency (or frequencies) depending on the shape of the tube.
Those resonant frequencies are used by speakers to carry linguistic messages.
So they've been given a special name in linguistics: they're called formants.
That's what we need to understand next.
After we've understood that, we'll immediately generalise that idea and think about the vocal tract not just as a tube with resonances but as a general filter: something which takes an input and has a variable response depending on the frequency of the input.
The input will be the basic sound generated either by the vocal folds or by frication, and the filter will be the vocal tract.
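To give a feel for what "a variable response depending on the frequency of the input" means, here is a minimal sketch using a generic two-pole digital resonator as a stand-in filter. This is not necessarily the model developed later in the course; the sample rate, the 1000 Hz resonant frequency, and the pole radius are all assumed values chosen only for illustration.

```python
import math


def resonator_gain(input_hz, resonant_hz=1000.0, sample_rate_hz=16000.0,
                   pole_radius=0.98, n_samples=4000):
    """Peak output of a two-pole resonator fed a unit-amplitude sine wave.

    Filter: y[n] = x[n] + 2*r*cos(theta)*y[n-1] - r^2 * y[n-2]
    """
    theta = 2 * math.pi * resonant_hz / sample_rate_hz
    a1 = 2 * pole_radius * math.cos(theta)
    a2 = -pole_radius ** 2
    y1 = y2 = 0.0                      # previous two output samples
    peak = 0.0
    for n in range(n_samples):
        x = math.sin(2 * math.pi * input_hz * n / sample_rate_hz)
        y = x + a1 * y1 + a2 * y2
        y2, y1 = y1, y
        if n >= n_samples // 2:        # skip the start-up transient
            peak = max(peak, abs(y))
    return peak


gain_on = resonator_gain(1000.0)       # input at the resonant frequency
gain_off = resonator_gain(3000.0)      # input well away from it
print(f"gain at 1000 Hz: {gain_on:.1f}, gain at 3000 Hz: {gain_off:.2f}")
```

The same filter gives a large output for an input at its resonant frequency and a much smaller one for an input far from it, which is the behaviour described above.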