Lab report

Write up your findings from the exercise in a lab report, to show that you can make connections between theory and practice.

You should write a lab report about the two parts of the synthesis practical (“Step-by-step” and “Finding mistakes”). Keep it concise and to the point, but make sure you detail your findings using Festival. The lab report should have a clear structure and be divided into sections and subsections.

What exactly is meant by “lab report”?

It is not a discursive essay. It is not merely documentation of what you typed into Festival and what output you got. It is a factual report of what you did in the lab that demonstrates what you learnt and how you can relate that to the theory from lectures. You will get marks for:

  • completing all parts of the practical, and demonstrating this in the report
  • providing interesting errors made by Festival, with correct analysis of the type of error and of which module made the error
  • a clear demonstration that you understand the difference between human speech production and the methods used by Festival to generate speech. Why do the methods employed by Festival cause problems at times? What benefits could using a TTS system bring?
  • describing the theory behind each module (what is it trying to achieve?), linking that to practice (what does it actually do?) and analysing errors in terms of the underlying techniques used in Festival. Feel free to be critical when necessary: Festival is not perfect, and the voice we have given you to analyse is certainly not the best we could make with Festival! One way to show your critical thinking skills is to suggest how the mistake could be avoided: which part of Festival would need to be improved, and how?
  • clear and concise writing, and effective use of diagrams and examples. A good diagram will almost always allow you to write less text.

How much background material should there be?

Do not spend too long simply restating material from lectures or textbooks without telling the reader why you are doing this. Do provide enough background to demonstrate your understanding of the theory, and to support your explanations of how Festival works and your error analysis. If you make claims that are not drawn from the lecture material (e.g. specific phonetic/phonological phenomena or methods used by Festival), use specific and carefully chosen citations. You can cite textbooks and papers. You do not need to cite research papers in this class, but you may get more marks for relevant use of citations that directly help you explain your analysis. Avoid citing lecture slides or videos. Make sure everything you cite is correctly listed in the bibliography.

You do not have to explain algorithms we have not gone over in this class (e.g. specific POS tagging methods) unless you specifically want to include an analysis/explanation of why they cause the specific errors you are analysing.

In the background section you should:

  • Briefly outline human speech production: just enough to contrast with what Festival is doing
  • Describe what Festival should be doing in theory
  • Describe what it does in practice

Have a look at the structured marking scheme to get an idea of other parts of the TTS pipeline you should make sure you cover.

Writing style

The writing style of the report should be similar to that of a journal paper. Don’t list every command you typed! Say what you were testing and why, what your input to Festival was, what Festival did and what the output was. Use diagrams (e.g., to explain parts of the pipeline, or to illustrate a linguistic structure) and annotated waveform and spectrogram plots to illustrate your report. It may not be appropriate to use a waveform or spectrogram to illustrate a front-end mistake. Avoid using verbatim output copied from the Terminal, unless this is essential to the point you are making.

Additional tips

Give the exact text you asked Festival to synthesise, so that the reader/marker can reproduce the mistakes you find in Festival (this includes punctuation!). Always explain why each of the mistakes you find belongs in a particular category. For example, differentiate carefully between

  • part-of-speech prediction errors that cause the wrong entry in the dictionary to be retrieved
  • errors in letter-to-sound rules
  • waveform generation problems (e.g. an audible join)

Since the voice you are using in Festival is Scottish English, it is only fair to find errors for that variety of English, so take care with your spelling of specific input texts! You may find it helpful to listen to some actual speech from the source: Prof Alan Black. Quite conveniently for us, you can in fact listen to Alan Black talk about TTS to study his voice and the subject matter at the same time!