Vowel Space

Variability in the acoustic vowel space and its relationship to the inventories of contrastive vowel sounds in languages.

In previous videos we have seen that the first and second formants of the vocal tract are related to our perception of vowel quality, and that the source-filter model gives us a way to understand the origins of these formants and why they change with the changing shape of the vocal tract. In this video, we will consider variability in the acoustic vowel space, as well as its relationship to the inventories of contrastive vowel sounds in languages.
We’ve already seen that plotting the first and second formants against one another results in a figure that looks quite similar to the vowel quadrilateral of the IPA, though the correspondence is not perfect.
In the plot on the right, formants from one token of each vowel quality were used, giving the impression of a clean acoustic space with neatly delimited vowel categories. Unfortunately, natural data from spontaneous speech and multiple speakers is never this straightforward.
The data shown here are taken from the classic study of American English vowels by Peterson and Barney in 1952. In this case, the axes are oriented in a more mathematically conventional way, with the x-axis representing F1, the y-axis representing F2, and the origin in the lower left.
If we think of the IPA vowel chart as aligning with a speaker who is facing to our left,
then the present plot is analogous to a speaker who is lying on their back with their right side facing us.
This F1-F2 plot shows data from 76 speakers producing ten different vowels. As we can see, there is considerable variability in the productions of these vowels, and considerable overlap between a number of vowel categories. Even when the vowel productions do not overlap, they are very near to one another in the acoustic space, making categorization difficult.
These vowels were then presented to listeners who were asked to report which vowel was being spoken. Although most vowels were identified correctly most of the time, a considerable number of vowels were frequently misidentified. When the data were re-plotted including only vowel tokens that were correctly identified 100% of the time, the overlap between the vowel categories in the acoustic space was reduced.
The question remains about how speakers and listeners are able to communicate with such a high degree of variation in the acoustic transmission. In fact, variation in the vowel space has been found for different speakers depending on age, sex, speech rate, speech style, dialect, and of course different languages also have diverse vowel spaces when compared to one another.
Here is an example of differences between male and female vowel spaces in another study of American English vowels. Here we see that the vowel space of women is larger than that of men. The black squares indicate the vowel space created from the mean formant values. White circles indicate the smallest value for each vowel, while the gray squares indicate the largest values. There are a variety of possible reasons why this difference between men and women may exist, from physical characteristics to socialization and gender roles, but whatever the source of this variation, the problem remains of how to equate these very different acoustic patterns to one another within the same perception or speech recognition system.
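One rough way to put a number on "larger" is to treat the mean formant values of the corner vowels as the vertices of a polygon in the F1-F2 plane and compute its area. The sketch below does this with the shoelace formula. The formant values and the names speaker_a and speaker_b are hypothetical, chosen only for illustration; they are not taken from the study shown in the video.

```python
def polygon_area(points):
    """Area of a polygon from its vertices, listed in order (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical mean (F1, F2) values in Hz for three corner vowels,
# ordered high front, low, high back. Illustrative numbers only.
speaker_a = [(300, 2300), (750, 1200), (320, 900)]
speaker_b = [(350, 2800), (900, 1400), (380, 1000)]

area_a = polygon_area(speaker_a)
area_b = polygon_area(speaker_b)

print(f"speaker A vowel space area: {area_a:.0f} Hz^2")
print(f"speaker B vowel space area: {area_b:.0f} Hz^2")
print("B has the larger vowel space" if area_b > area_a else "A has the larger vowel space")
```

An area in raw Hz is only a crude measure. Comparing speakers in practice usually involves some form of normalization, which is exactly the problem of equating different acoustic patterns raised above.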
Despite the variability within the vowel space, the human vocal tract is still limited by its physical properties. It’s no wonder that vowel productions sometimes overlap with one another inside this space. However, we might expect languages to have smaller vowel inventories as a result of this limitation, in order to maximize the perceptual difference between vowel categories. Indeed, many languages of the world have small vowel inventories of only 3 or 5 vowel qualities. In these cases, the vowel space tends to resemble an inverted triangle, with high front and back vowels and a low central or front vowel. As more vowel contrasts are added into the acoustic space, the vowel qualities crowd and push each other around to maintain their perceptual distance, much the way people will arrange themselves to be evenly spaced within a crowded elevator.
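The crowding effect can be illustrated with a small sketch that measures the smallest distance between any two vowels in an inventory. The inventories and formant values below are hypothetical round numbers, not measurements from any language, and plain Euclidean distance in Hz is an assumption made for simplicity; a perceptual scale would be a more realistic choice.

```python
import math
from itertools import combinations

def min_pairwise_distance(inventory):
    """Smallest Euclidean distance between any two vowels in the (F1, F2) plane."""
    return min(math.dist(a, b) for a, b in combinations(inventory.values(), 2))

# Hypothetical (F1, F2) targets in Hz. Illustrative numbers only.
three_vowels = {"i": (300, 2300), "a": (750, 1300), "u": (320, 900)}
# Same three vowels plus two mid vowels squeezed into the same space.
five_vowels = dict(three_vowels, e=(500, 1900), o=(520, 1000))

d3 = min_pairwise_distance(three_vowels)
d5 = min_pairwise_distance(five_vowels)

print(f"closest pair, 3-vowel inventory: {d3:.1f} Hz")
print(f"closest pair, 5-vowel inventory: {d5:.1f} Hz")
```

With these numbers, the closest pair in the five-vowel inventory is nearer together than the closest pair in the three-vowel inventory, which is the sense in which added contrasts reduce the perceptual distance available to each vowel.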
Given the limited acoustic space which we have to work with inside our vocal tracts, vowel inventories in languages need to balance pressures leading to ease of perception, that is, maintaining vowel distinctions that are as far as possible from one another, with ease of production. This puts a limit on the number of vowel contrasts one language can make use of. With more contrasts to be identified, the level of perceptual confusion rises, which may lead to phenomena like mergers where contrasts between sounds are lost.
