CUI 2024 slides available

The slides from my (Simon’s) keynote are now online under Courses > One-off events. I’ll try to add a recording and perhaps a bibliography later.

Spectrum and spectrogram

The spectrum and the spectrogram are much more useful ways of analysing speech signals than the waveform. We look at how to create them using Wavesurfer and what effect the analysis window size has on what we see.    

Continue reading...

Pipeline architecture for TTS

Pipeline architecture

Most text-to-speech systems split the problem into two main stages. The first stage is called the front end and contains many separate processes which gradually build up a linguistic specification from the input text. The second stage typically uses language-independent techniques (although they still require a language-specific speech corpus) to generate a waveform. Here we see those two […]

Continue reading...

Token passing

Token passing is a really nice way to understand (and even to implement) Viterbi search for Hidden Markov Models. Here we see token passing in action, and you can look at the spreadsheet to see the calculations. To keep things simple, we are ignoring transition probabilities in this example. It would be simple to add them […]

Continue reading...

Bitrate

The bitrate (or bit rate) of a signal is the number of bits required to store, or transmit, 1 s of that signal. A bit is a binary number: either 0 or 1. Let’s calculate the bitrate of a digital waveform. First you should revise the concepts of sampling and quantisation from this module of the […]

Continue reading...

The Gaussian probability density function: understanding the equation

The equation for the Gaussian probability density function looks a little scary at first, but this video should help you understand what each of the terms is doing, and how they fit together. After watching the video download the spreadsheet which shows the calculations and plots from this video (tip: the Apple Numbers.app version includes images […]

Continue reading...

Entropy: understanding the equation

The equation for entropy is very often presented in textbooks without much explanation, other than to say it has the desired properties. Here, I attempt an informal derivation of the equation starting from uniform probability distributions. A good way to think about information is in terms of sending messages. In the video, we send messages […]

Continue reading...

Wave propagation on the surface of water

At the Alhambra (Granada, Spain) I saw this nice example of waves from a point source propagating in all directions at a fixed speed.

Continue reading...

Classification and regression trees (CART)

A quick introduction to a very simple but widely-applicable model that can perform classification (predicting a discrete label) or regression (predicting a continuous value). The tree is learned from labelled data, using supervised learning. Before watching this video, you might want to check that you understand what Entropy is.

Continue reading...

A simple synthetic vowel

Using Praat, we synthesise a simple vowel-like sound, starting with a pulse train, which we pass through a filter with resonant peaks.

Continue reading...

Windowing

When we say that a signal is non-stationary we mean that its properties, such as the spectrum, change over time. To analyse signals like this, we need to first assume that these properties do not change over some short period of time, called the frame. We can then analyse individual frames of the signal, one at a […]

Continue reading...

Aliasing

Aliasing

In sampling and quantisation we saw that sampling a signal at a fixed rate means that there is an upper limit on the frequencies that can be represented. This limit is called the Nyquist frequency. Before sampling a signal, we must remove all energy above the Nyquist frequency, and here we will see what would […]

Continue reading...