Lab report: Structure and Tips

Write up your findings from the exercise in a lab report, to show that you can make connections between theory and practice. This page describes the required lab report structure.

You should write a lab report about this speech recognition practical. Keep it concise and to the point, but make sure you explain what the digit recogniser is in theory and how it is implemented and used in practice. You should also report your experimental work, clearly explaining your experimental design and your results, and linking your results to your research questions and hypotheses through your conclusions.

Report Structure

Title/Author/Word count info

Make sure you include a title, your exam number, and your word count at the beginning of your report.

  • Choose an informative title that tells the reader a bit about what the report is about. For example, “SP Assignment 2” is not very informative.
  • State your exam number (on your student card, starts with a B)
  • State the word count (see formatting guidance)

The following gives an outline of the sections and basic information you need to include. We are, as always, following the University of Edinburgh common marking scheme. So, to get the highest marks, start with the basics and try to add some extra analysis, contextualisation or critique beyond the basic information.

1 Introduction

[5 marks]

This section should give an overview of your report.

  • You should briefly introduce:
    • The task you are focused on (i.e., digit recognition as a whole-word ASR problem) and the goals of your study
    • The motivation for the specific experiments that you focus on later in the report
  • You can also highlight any key findings you made and/or implications you have drawn from your experiments

2 Data

[5 marks]

Describe the dataset you use in your experiments:

  • Who were the speakers?
  • When did they record the data?
  • How did they record the data? What were the recording conditions?
  • What other metadata is available about the recordings?

3 Method

In this section, you should describe the digit recogniser itself, following the subsection headings listed below. You should explain what its components are, how it is trained, how it is used for speech recognition and how you will evaluate its performance. You should relate the digit recogniser to ASR theory and contrast this with how this specific assignment setup does ASR in practice.

3.1 Feature Extraction

[10 marks]

This section relates to Module 8: Speech Recognition – Feature Engineering.

  • What are MFCCs? What do they capture?
  • Why are MFCCs used in this digit recogniser setup?
  • How were the MFCCs extracted?

3.2 Training

[10 marks]

This section relates to Module 9: Speech Recognition – the Hidden Markov Model, and Module 10: Speech Recognition – Connected Speech and HMM training.

  • How are HMMs used in the digit recogniser?
  • How are the HMMs trained?
    • What parameters are “learned” during training?
    • What methods are used for training?
      • There are two scripts used for training the digit recogniser. What do they do differently?
      • How and when are uniform segmentation, Viterbi training, and the Baum-Welch algorithm used?
      • What’s the relationship between the Viterbi algorithm and Viterbi training?
    • Why do we have several training steps?
    • What does each of the steps do?
    • How does each step make use of the extracted features?
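
As an illustration of the uniform segmentation step mentioned above, here is a minimal sketch in Python. It is not the code used by the assignment scripts: the function names and the one-dimensional toy "features" are assumptions made for this example only (real features would be multi-dimensional MFCC vectors). The frames of one training token are split into equal chunks, one per emitting state, and a first estimate of each state's mean is computed from its chunk. Later training steps would then refine these initial estimates.

```python
def uniform_segmentation(num_frames, num_states):
    """Return a state index for every frame, splitting frames into equal chunks."""
    if num_states < 1 or num_frames < num_states:
        raise ValueError("need at least one frame per state")
    # Frame t is assigned to state floor(t * num_states / num_frames).
    return [t * num_states // num_frames for t in range(num_frames)]


def initial_state_means(features, num_states):
    """Average the (1-D, toy) feature values assigned to each state."""
    alignment = uniform_segmentation(len(features), num_states)
    means = []
    for state in range(num_states):
        values = [x for x, s in zip(features, alignment) if s == state]
        means.append(sum(values) / len(values))
    return means


# Hypothetical 1-D "features" for one training token and 3 emitting states.
toy_features = [1.0, 1.0, 2.0, 5.0, 5.0, 6.0, 9.0, 9.0, 9.0]
toy_alignment = uniform_segmentation(len(toy_features), 3)
toy_means = initial_state_means(toy_features, 3)
print(toy_alignment)
print(toy_means)
```

The alignment here ignores the feature values entirely, which is why it can only serve as a starting point for training.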

3.3 Recognition

[10 marks] 

This section relates to Module 9: Speech Recognition – the Hidden Markov Model.

  • What methods/components are used for digit recognition?
    • How is the Viterbi algorithm related to recognition?
    • What is token passing and how is it used?
  • What’s the role of the acoustic model and how does this relate to HMM training for this specific digit recogniser setup?
  • What’s the role of the language model? What is the specific language model used for the assignment digit recogniser? What’s the difference between the language model for the single digit recogniser and the one for the digit sequence recogniser?
  • How are the acoustic model and language model combined to perform recognition?
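
To make the Viterbi question above concrete, here is a minimal sketch of the Viterbi algorithm in Python for a toy HMM with discrete observations. It is an illustration only and not how HTK implements recognition: the two-state model, its probabilities and all names are invented for this example, and a real digit recogniser would use Gaussian emission densities over MFCC vectors. The sketch works in the log domain and returns the single most likely state sequence together with its log probability.

```python
import math


def safe_log(p):
    """Log that maps probability 0 to -infinity instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for a discrete-observation HMM."""
    # best[t][s]: log probability of the best path ending in state s at time t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][observations[t]]
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Hypothetical left-to-right model: state A must come before state B.
states = ["A", "B"]
log_start = {"A": safe_log(1.0), "B": safe_log(0.0)}
log_trans = {
    "A": {"A": safe_log(0.5), "B": safe_log(0.5)},
    "B": {"A": safe_log(0.0), "B": safe_log(1.0)},
}
log_emit = {
    "A": {"x": safe_log(0.9), "y": safe_log(0.1)},
    "B": {"x": safe_log(0.1), "y": safe_log(0.9)},
}
best_log_prob, best_path = viterbi(["x", "x", "y", "y"], states,
                                   log_start, log_trans, log_emit)
print(best_path, math.exp(best_log_prob))
```

Working with log probabilities turns products into sums and avoids numerical underflow on long observation sequences.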

3.4 Evaluation

[5 marks]

This section is related to Module 10 (see readings).

  • What metric do we use to evaluate how well a specific digit recogniser setup did? How is it calculated?
  • You can also note any other considerations around evaluation of the digit recogniser in this section.
  • You aren’t required to do statistical tests for this assignment, but if you have experience in this area you can apply them in this assignment if you want to and if it makes your argument stronger (you won’t get marks for simply applying a lot of tests without reasoning). If you do use statistical tests, explain what you are using and why.
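
As an illustration of how an error metric can be calculated, here is a minimal sketch in Python, assuming the metric is word error rate (WER), the standard metric for ASR; check which measure your own setup reports. The function name and the example strings are hypothetical. WER is the minimum number of substitutions, deletions and insertions needed to turn the reference word sequence into the recognised one, divided by the number of words in the reference. The sketch finds that minimum with dynamic programming.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dist[i][j]: minimum number of edits to turn ref[:i] into hyp[:j].
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub_cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + sub_cost,  # match or substitution
                dist[i - 1][j] + 1,             # deletion
                dist[i][j - 1] + 1,             # insertion
            )
    return dist[len(ref)][len(hyp)] / len(ref)


# Hypothetical example: one substitution and one deletion in four words.
example_wer = word_error_rate("one two three four", "one six three")
print(example_wer)
```

Because insertions are counted, WER can exceed 1.0 (100%) when the recogniser outputs many more words than the reference contains.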

4 Experiments

Having established the data you will use and the digit recogniser (as method in theory and practice), it’s now time to write up your specific experiments.

Use your time in the labs to discuss your experiments and troubleshoot your scripts with the tutors.

You should use a separate subsection to describe each of your main experiments (these may consist of further sub-experiments addressing the section’s research question/hypothesis).

Your mark will reflect your explanation of the research questions, experimental design, presentation of results, and conclusions across all your experiments. So within each subsection, you should make sure the following are clear:

  • [10 marks] Research questions and hypotheses
    • What are you trying to find out from your experiment?
    • What hypotheses can you make based on your knowledge of speech and ASR?
    • You should explain what evidence you have for your hypotheses. This can come from citations, but also reasoning based on what we have covered in class. Use citations and references where you can to strengthen your argument.
  • [10 marks] Experimental design
    • Explain your experiment setup and how it relates to your hypotheses
    • What specific data did you use?
    • If you alter the model, what did you change?
    • In general, aim for reproducibility so that another student could implement the same experiment. You don’t have to include detailed speaker lists, but do describe how you selected speakers for training and testing sets
  • [10 marks] Results
    • Present your results clearly with tables and/or graphs
      • Aim for readability with tables and figures. A reader should be able to get the idea of whether the results support your hypothesis from a quick glance. That means using appropriate captions, labelling axes and giving table rows and columns informative names.
  • [10 marks] Conclusions
    • Discuss whether your results support your hypothesis (or not)
    • Discuss limitations of your experiments and how certain you are of your findings.
      • Do you think your results would generalise to other speakers/recordings with the same characteristics? How about speakers/recordings with some different characteristics?
      • Are there other experiments you could do (if you had more time) that would help understand whether the results would generalise?
    • Extra: if you can connect your results to other findings in the literature: what does your experiment contribute to the bigger picture on ASR performance?

5 Discussion and Overall Conclusion

[5 marks]

In this section you should briefly summarise and discuss your findings across your experiments and their implications: what do you know now about ASR that you didn’t know before?

  • What were your main findings?
  • Discuss the implications of your findings: Can you make any connections between the experiments in terms of what factors affect the digit recogniser performance the most?
  • Are there any general conclusions you can make about the digit recogniser setup? If not, could you modify your experiments so that you could make more general conclusions?

Range of experiments

[10 marks]

In addition to the report sections listed above, you will also receive some marks for the range of experiments that you write up (you don’t need to write a specific section for this; it should be evident from section 4). You should report on at least 2 experiments to pass. Historically, most people can do well writing up 3-4 experiments.

You will get more marks for exploring more aspects of the digit recogniser, but you should also ensure your experiments actually help you get stronger conclusions about your research questions/hypotheses. So, it can be beneficial to design follow-up experiments that shed more light on a specific research question. This helps us to see your depth of understanding.

In general, exploration of 3 research questions done in depth with well-considered experimental designs (possibly reporting on sub-experiments) will get you a better mark than many unconnected experiments with simplistic experimental designs.

You don’t necessarily have to design your experiments such that they all relate to one another (i.e., build towards one big research question for the whole report). But if you can do this, and so support any overall conclusions you make, that would be seen as a good thing!

More writing advice

What exactly is meant by “lab report”?

It is not a discursive essay. It is also not merely documentation of commands that you ran and what output you got. It is a factual report of what you did in the lab that demonstrates what you learned and how you can relate that to the theory from lectures. You will get marks for:

  • completing all parts of the practical, and demonstrating this in the report
  • a clear demonstration that you understand what each part of the digit recogniser does, how this relates to ASR theory and how this relates to the HTK tools we use in practice
  • clear and concise writing, and effective use of diagrams, tables and graphs

How much background material should there be?

Do not spend too long simply restating material from lectures or textbooks without telling the reader why you are doing this.

Do provide enough background to demonstrate your understanding of the theory, and to support your explanations of how HTK works and your experiments. Use specific and carefully chosen citations. Cite textbooks and papers. You will get more marks if you cite better and more varied sources (e.g., going beyond the essential course readings). If you only cite from the main textbook, this will not get you top marks. Avoid citing lecture slides or videos, unless you really cannot find any other source (which is unlikely). Make sure everything you cite is correctly listed in the bibliography.

Writing style

The writing style of the report should be similar to that of a journal paper. Don’t list every command you typed! You do not need to include your shell scripts. Use diagrams to illustrate your report, and tables and/or graphs to summarise your results. Do not include any verbatim output copied from the Terminal: you will not receive any marks for this.

We won’t take into consideration any appendices.

Other tips

You do not need to list the individual speakers in each of your data sets, but do carefully describe the data (e.g., “20 male native speakers of Scottish English using laptop microphones”). You might use tables to present this in a compact form, and perhaps give short names or acronyms to each set, such as “20-male-SC-laptop”.