Time domain

Sound is a wave of pressure travelling through a medium, such as air. We can plot the variation in pressure against time to visualise the waveform.

Sound is a wave and it has to travel in a medium.
Here, the medium is air.
So there's air in this space.
Sound is a pressure wave, so a wave of pressure is going to travel through this medium.
Let's make a simple sound: a hand clap.
When we do that, our hands trap some air between them.
That compresses the air: the pressure increases.
Then it escapes as a pulse of higher pressure air.
We can draw a picture of that high pressure air propagating as a wave through the medium.
That red line is indicating a higher pressure region of air.
So, this is our first representation of sound: its propagation through physical space, where it travels at a constant speed.
In air, that speed is about 340 metres per second, which means it takes about 3 seconds to travel a kilometre.
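As a quick check of that arithmetic, here is a minimal sketch in Python. The function name `travel_time_s` is purely illustrative, and 340 metres per second is the approximate figure quoted above, not a precise constant.

```python
# Approximate speed of sound in air, as quoted in the transcript.
SPEED_OF_SOUND_M_PER_S = 340.0

def travel_time_s(distance_m, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Return the time in seconds for sound to cover distance_m metres."""
    return distance_m / speed_m_per_s

# One kilometre takes a little under 3 seconds.
print(round(travel_time_s(1000.0), 2))  # 2.94
```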
But rather than diagrams like this - of sound waves propagating through space and then disappearing - it's much more informative to make a record of that sound.
We can do that by picking a single point in space and measuring the variation in pressure at that point over time.
We make that measurement with a device and that device is a microphone.
So let's use a microphone to measure the pressure variation at a single point in space and then plot that variation against time.
So a plot needs some axes.
Here, the horizontal axis will be time and the vertical axis will be the amplitude of the pressure variation.
It's very important to label the axes of any plot with both the quantity being measured and its units.
This axis is 'time', so we label it with that quantity: time.
Time has one standard scientific unit.
That unit is the second, which is written as just 's'.
On the vertical axis, we're going to measure the amplitude of the variation in pressure.
So I've labelled it with the quantity 'amplitude', and 0 corresponds to the ambient pressure.
But we don't actually have any units on this axis.
That's simply because our microphone normally is not a calibrated scientific instrument.
It just measures the relative variation in pressure and converts that into an electrical signal that is proportional to the pressure variation.
So we just mark the 0 amplitude point but don't normally specify any units.
Now we can make the measurement of our sound.
As a sound wave passes the microphone, the pressure at that point rises to be higher than normal and then drops to be lower than normal, and eventually settles back to the ambient pressure of the surrounding air.
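To make that shape concrete, here is a rough sketch that synthesises a clap-like pulse instead of recording one. The decaying sine used here is only an assumed stand-in for a real clap. The function name and all parameter values (sample rate, frequency, decay time) are illustrative choices, and the amplitude has no units, just like the uncalibrated microphone signal.

```python
import math

def clap_like_pulse(duration_s=0.02, sample_rate_hz=8000,
                    freq_hz=200.0, decay_s=0.003):
    """Return (times, amplitudes) for an oscillation that decays towards 0.

    An amplitude of 0 stands for the ambient pressure; positive values are
    above ambient and negative values are below it.
    """
    n = round(duration_s * sample_rate_hz)
    times = [i / sample_rate_hz for i in range(n)]
    amplitudes = [
        math.exp(-t / decay_s) * math.sin(2 * math.pi * freq_hz * t)
        for t in times
    ]
    return times, amplitudes

times, amplitudes = clap_like_pulse()

# The pressure first rises above ambient, then drops below it,
# and finally settles back close to 0.
peak_index = amplitudes.index(max(amplitudes))
trough_index = amplitudes.index(min(amplitudes))
print(peak_index < trough_index, abs(amplitudes[-1]) < 0.01)
```

Plotting `amplitudes` against `times` would give a waveform with time in seconds on the horizontal axis and unitless amplitude on the vertical axis.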
Let's plot the output of the microphone and listen to the signal the microphone is now recording.
We're going to take the output of this microphone and we're going to record this signal - this electrical signal - on this plot.
Here's the plot we just made.
The plot is called a waveform and this is our first actually useful representation of sound.
This representation is in the time domain because the horizontal axis of the plot is time.
Later, we'll discover other domains in which we can represent sound, and we'll plot those using different axes.
The waveform is useful for examining some properties of sound.
For example, here's a waveform of a bell sound.
We can see that, for example, the amplitude is clearly decaying over time.
This is a waveform of speech: 'voice'.
It's the word 'voice'. Some of the things we can observe from this waveform are, again, that the amplitude varies over time in some interesting way, and that the word has some duration.
We could enlarge the scale to see a little bit more detail.
This particular part of the waveform has something quite interesting going on.
It clearly has a repeating pattern; that looks like it's going to be important to understand.
But in contrast, let's look at some other part of this waveform.
Maybe this part here.
It doesn't matter how much you zoom in here, you won't find any repeating pattern.
This is less structured: it's a bit more random.
That's also going to be important to understand.
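The difference between those two parts of the waveform can also be checked numerically. The sketch below uses autocorrelation, which is not a method introduced in this video, just one common way to test whether a signal repeats. Both signals are synthetic stand-ins (a sine wave for the repeating part, uniform random noise for the unstructured part), and the function names are illustrative.

```python
import math
import random

def normalised_autocorrelation(samples, lag):
    """Compare the signal with a copy of itself delayed by `lag` samples."""
    energy = sum(x * x for x in samples)
    if energy == 0.0:
        return 0.0
    overlap = sum(samples[i] * samples[i + lag]
                  for i in range(len(samples) - lag))
    return overlap / energy

def strongest_repeat(samples, min_lag, max_lag):
    """Return (lag, score) for the lag in [min_lag, max_lag] scoring highest."""
    scores = [(normalised_autocorrelation(samples, lag), lag)
              for lag in range(min_lag, max_lag + 1)]
    best_score, best_lag = max(scores)
    return best_lag, best_score

# A pattern that repeats every 50 samples, and some random noise.
periodic = [math.sin(2 * math.pi * i / 50) for i in range(400)]
rng = random.Random(0)  # fixed seed so the example is repeatable
noisy = [rng.uniform(-1.0, 1.0) for _ in range(400)]

periodic_lag, periodic_score = strongest_repeat(periodic, 20, 200)
noisy_lag, noisy_score = strongest_repeat(noisy, 20, 200)
print(periodic_lag, round(periodic_score, 3))
print(noisy_lag, round(noisy_score, 3))
```

The repeating signal gives a high score at the lag that matches its period, whereas the noise gives only low scores at every lag.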
So far, we've talked about directly plotting the output from a microphone.
Microphones essentially produce an electrical signal: a voltage.
That's an analogue signal: it's proportional to the pressure that's being measured.
But actually we're going to do all of our speech processing with a computer.
Computers can't store analogue signals: they are digital devices.
We're going to need to understand how to represent an analogue signal from a microphone as a digital signal that we can store and process in a computer.
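As a preview of what that conversion involves, here is a minimal sketch: measure the signal at regular instants (sampling) and round each measurement to an integer (quantisation). The sample rate of 16000 Hz, the 16-bit range and the 440 Hz test tone are assumptions chosen for illustration, not values given in this video.

```python
import math

def sample_and_quantise(signal, duration_s, sample_rate_hz=16000, bits=16):
    """Sample signal(t), expected in [-1, 1], and round to signed integers."""
    max_level = 2 ** (bits - 1) - 1  # 32767 for 16 bits
    n = round(duration_s * sample_rate_hz)
    samples = []
    for i in range(n):
        t = i / sample_rate_hz                  # time of the i-th sample
        value = max(-1.0, min(1.0, signal(t)))  # clip anything out of range
        samples.append(round(value * max_level))
    return samples

def tone(t):
    """Stand-in for the analogue microphone voltage: a 440 Hz sine wave."""
    return 0.5 * math.sin(2 * math.pi * 440.0 * t)

digital = sample_and_quantise(tone, duration_s=0.01)
print(len(digital), digital[:5])
```

The result is a plain list of integers, which is something a computer can store and process.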
We also already saw that sounds vary over time.
In fact, speech has to vary, because it's carrying a message.
So we'll need to analyse not whole utterances of speech, but just parts of the signal over short periods of time.
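A minimal sketch of that idea: cut the signal into short frames and measure something about each frame. Here the measurement is RMS amplitude, and the signal is a synthetic decaying tone standing in for the bell sound. The frame length, hop length and function name are illustrative assumptions.

```python
import math

def frame_rms(samples, frame_length, hop_length):
    """Return the RMS amplitude of each complete frame of the signal."""
    rms_values = []
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frame = samples[start:start + frame_length]
        mean_square = sum(x * x for x in frame) / frame_length
        rms_values.append(math.sqrt(mean_square))
    return rms_values

# Synthetic bell-like signal: a tone whose amplitude decays over time.
bell_like = [math.exp(-i / 200) * math.sin(2 * math.pi * i / 20)
             for i in range(1000)]

rms_per_frame = frame_rms(bell_like, frame_length=100, hop_length=100)
print([round(v, 3) for v in rms_per_frame])
```

Each value describes only a short stretch of the signal, so the list as a whole shows how the amplitude changes over time. For this signal the values fall steadily from frame to frame.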
Speech varies over time for many reasons, and that's controlled by how it's produced.
So we need to look into speech production, and the first aspect of it that we need to understand is: 'What is the original source of sound when we make speech?'