With only 3 short videos and 1 essential reading, you might be able to explore some of the recommended or extra readings this week. Feel free to pursue areas that interest you, and read either Jurafsky & Martin or Taylor – see which you prefer. Taylor is certainly the authority when it comes to Text-To-Speech, but on the other hand Jurafsky & Martin are experts in NLP.
Reading
Jurafsky & Martin (2nd ed) – Section 8.1 – Text Normalisation
We need to normalise the input text so that it contains a sequence of pronounceable words.
Taylor – Chapter 3 – The text-to-speech problem
Discusses the differences between spoken and written forms of language, and describes the structure of a typical TTS system.
Jurafsky & Martin – Chapter 5 – Part-of-Speech Tagging
For our purposes, only sections 5.1 to 5.5 are needed.
Jurafsky & Martin – Chapter 2 – Regular Expressions and Automata
An important technique used widely in NLP. In TTS, it can be applied to tasks such as detecting and expanding non-standard words.
Taylor – Chapter 4 – Text Processing
Complementary to Jurafsky & Martin, Section 8.1.
Taylor – Chapter 5 – Text decoding
Complementary to Jurafsky & Martin, Section 8.1.
Jurafsky & Martin – Section 3.4 – Finite-State Transducers
FSTs are a powerful and general-purpose mechanism for mapping ("transducing") an input string to an output string.
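To make the idea of detecting and expanding non-standard words with regular expressions concrete before you do the readings, here is a minimal sketch in Python. It is not taken from either textbook: the lookup tables, the function names `number_to_words` and `normalise`, and the choice to handle only two-digit numbers and three abbreviations are all assumptions made for illustration. A real TTS front end needs many more classes of non-standard word and must resolve ambiguity (for example, "Dr." can also mean "drive"), which a plain regular expression cannot do.

```python
import re

# Hypothetical, deliberately tiny lookup tables (illustration only).
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]
ABBREVIATIONS = {"Dr.": "doctor", "Mr.": "mister", "etc.": "et cetera"}


def number_to_words(n):
    """Spell out an integer in the range 0..99."""
    if n < 10:
        return ONES[n]
    if n < 20:
        return TEENS[n - 10]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


# One pattern per class of non-standard word.
# NUMBER matches a standalone run of one or two digits.
NUMBER = re.compile(r"\b\d{1,2}\b")
# ABBREV matches any known abbreviation that does not start mid-word.
ABBREV = re.compile(r"(?<!\w)(?:" + "|".join(re.escape(a) for a in ABBREVIATIONS) + ")")


def normalise(text):
    """Detect each non-standard word and replace it with pronounceable words."""
    text = ABBREV.sub(lambda m: ABBREVIATIONS[m.group(0)], text)
    text = NUMBER.sub(lambda m: number_to_words(int(m.group(0))), text)
    return text


print(normalise("Dr. Smith has 42 cats and 7 dogs"))
```

Anything the patterns do not recognise, such as a three-digit number, is left unchanged. That is a reasonable default for a sketch, but in a real system it would leave unpronounceable tokens in the output.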
This is a SIGNALS tutorial to consolidate the material from Modules 1 and 2.
To prepare for the tutorial, go back over the Jupyter notebooks for those modules. Agree with your tutorial group a list of points that you would like to go over with your tutor. Carefully order your list (by topic and priority) and bring it to the tutorial.
Here are the key points that you need to understand from each notebook, so concentrate on these when writing your list:
- signals/slp-m1-1-sounds-signals – periodicity and pitch
- signals/slp-m1-2-digital-signals-complex-numbers – a phasor is a sinusoid with both magnitude and phase
- signals/slp-m1-3-sampling-sinusoids – sampling, Nyquist frequency, aliasing
- signals/slp-m1-4-discrete-fourier-transform – the DFT decomposes any signal into a series of basis functions; each basis function is a phasor (i.e., a sinusoid with magnitude and phase)
- signals/slp-m1-5-interpreting-the-dft – relating what you see in the time domain to what you see in the frequency domain
- signals/sp-m2-1-impulse-as-source – an impulse train has energy at every multiple of its fundamental frequency
- signals/sp-m2-2-fir-filters – FIR filters are little more than a moving average; an intuitive understanding that changing the filter coefficients changes the frequency response
- signals/sp-m2-3-iir-filters – IIR filters can exhibit resonance; the filter coefficients are not very intuitive; an IIR filter can impose a spectral envelope with resonances (formants) on its input signal; exciting an IIR filter with an impulse train can synthesise speech
Eventually, you may be able to understand a lot more of the material in the notebooks (so come back to them in a few weeks and try again), but the above is quite an achievement and is all you really need for the course.
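If you want to check two of these key points for yourself outside the notebooks, the following sketch may help. It is not code from the notebooks: the function names, the signal length of 64 samples, and the impulse period of 8 samples are assumptions chosen so that the arithmetic is easy to follow. It uses a naive DFT written with the standard library only, which is far slower than a real FFT but shows each basis function explicitly as a phasor.

```python
import cmath
import math


def dft(signal):
    """Naive DFT: correlate the signal with one complex phasor per frequency bin."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def impulse_train(length, period):
    """A 1 every `period` samples, 0 elsewhere."""
    return [1.0 if t % period == 0 else 0.0 for t in range(length)]


def moving_average(signal, taps=2):
    """FIR filter with `taps` equal coefficients; samples before the start count as 0."""
    return [
        sum(signal[t - i] for i in range(taps) if t - i >= 0) / taps
        for t in range(len(signal))
    ]


N = 64       # number of samples analysed (illustrative choice)
PERIOD = 8   # samples between impulses, so the fundamental falls in bin N / PERIOD = 8

train = impulse_train(N, PERIOD)
magnitudes = [abs(c) for c in dft(train)]

# Bins with non-negligible energy, from 0 up to the Nyquist bin N / 2.
harmonic_bins = [k for k in range(N // 2 + 1) if magnitudes[k] > 1e-6]
print(harmonic_bins)  # [0, 8, 16, 24, 32]

# The same impulse train after a 2-point moving average (a very simple FIR filter).
smoothed_magnitudes = [abs(c) for c in dft(moving_average(train))]
for k in harmonic_bins:
    print(k, round(magnitudes[k], 3), round(smoothed_magnitudes[k], 3))
```

Before filtering, every harmonic has the same magnitude. After the moving average, the higher harmonics are progressively attenuated and the one at the Nyquist bin disappears, which is one way to see that the filter coefficients determine the frequency response. Try changing `taps` and see how the pattern changes.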
This is a PHON tutorial about the phoneme. Go through the following Jupyter Notebook:
- phon/phon-m4-1-phoneme-tutorial
You might need to update your copy of the notebooks to get the latest version.
Exercises
Go through the following Jupyter Notebooks, which cover the topics Prosody, Decision Tree, and Learning Decision Trees:
- tts/tts-m4-1-entropy – do this one on your own then check your understanding by explaining entropy to another student in your group
- tts/tts-m4-2-decision-tree-pencil-and-paper – do this one in small groups of 2 or 3 students
- tts/tts-m4-3-learning-decision-trees – do this one on your own (but if the code is challenging for you, pair up with someone who can code)
You might need to update your copy of the notebooks to get the latest version. You may also need to install the following dependencies if you don’t already have them:
$ cd uoe_speech_processing_course
$ conda activate slp
$ conda install -c conda-forge anytree urllib3 requests
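As a warm-up for the entropy and decision tree notebooks, here is a minimal sketch of how entropy can be used to choose a question when learning a decision tree. It is not the code from the notebooks, and the toy data is entirely made up: the labels ("B" for a phrase break after a word, "NB" for no break), the three question names, and the function names are all assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution over `labels`."""
    total = len(labels)
    probabilities = [count / total for count in Counter(labels).values()]
    return sum(p * math.log2(1 / p) for p in probabilities)


def information_gain(labels, answers):
    """Reduction in entropy from splitting `labels` by the yes/no `answers`."""
    yes = [label for label, answer in zip(labels, answers) if answer]
    no = [label for label, answer in zip(labels, answers) if not answer]
    # Entropy after the split: each partition weighted by its share of the data.
    weighted = sum(len(part) / len(labels) * entropy(part) for part in (yes, no) if part)
    return entropy(labels) - weighted


# Hypothetical toy data: one label per word, and one yes/no answer per word per question.
LABELS = ["B", "B", "B", "B", "NB", "NB", "NB", "NB"]
QUESTIONS = {
    "followed_by_punctuation": [True, True, True, True, False, False, False, False],
    "is_content_word": [True, True, True, False, True, False, False, False],
    "starts_with_vowel": [True, True, False, False, True, True, False, False],
}

gains = {name: information_gain(LABELS, answers) for name, answers in QUESTIONS.items()}
best_question = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name}: {gain:.3f} bits")
print("best question:", best_question)
```

A question that separates the labels perfectly removes all the uncertainty, while one whose partitions have the same label mix as the full set removes none. Learning a tree repeats this choice recursively on each partition until some stopping criterion is met.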
Practical assignment
Continue the first assignment. Complete all the milestones to date. Use the forums to get help.
Prepare for the tutorial session
In your small group of 2 or 3 students, prepare your workings for the tts/tts-m4-2-decision-tree-pencil-and-paper notebook and be ready to share them with the group. You might need to scan or take photos of pencil-and-paper work, so do that in advance of the tutorial. Prepare questions about the other notebooks with the whole tutorial group and make a structured list of questions to go through with the tutor.