Suggested experiments

These example experiments are just to get you thinking. You should devise interesting experiments of your own.

To get started, use the already-trained models from the speaker-dependent experiment.

If you use models trained only on your own speech to recognise the Test set of another speaker, what Word Error Rate do you expect?

That will give you some clues, but isn’t a very sophisticated experiment: no real-world system would attempt to do that. Your main experiments should investigate speaker-independent systems trained and tested on larger numbers of speakers.

The test speakers must be distinct from the training speakers. You need to control all factors that are not of interest, which may include accent, gender, microphone type, and amount of training data. Here are some possible experiment designs:

  1. The effect of gender, with simplistic control over accent and microphone
    • Training set: the training data of 20 female UK English speakers using headset microphones
    • Test set A: the test data of 20 female UK English speakers not in the training set, also using headset microphones
    • Test set B: the test data of 20 male UK English speakers (obviously not in the training set), also using headset microphones
  2. The effect of gender, with more sophisticated control over accent and microphone (version 1)
    • Training set A: the training data of 50 female speakers, with a mixture of accents and microphones
    • Training set B: the training data of 50 male speakers, with a mixture of accents and microphones in the same proportions as training set A
    • Test set: the test data of 50 female speakers not in training set A, with a mixture of accents and microphones in the same proportions as training set A
  3. The effect of gender, with more sophisticated control over accent and microphone (version 2)
    • Training set: the training data of 50 female speakers with a mixture of accents and microphones
    • Test set A: the test data of 50 female speakers not in the training set, with a mixture of accents and microphones in the same proportions as the training set
    • Test set B: the test data of 50 male speakers, with a mixture of accents and microphones in the same proportions as the training set

Some of these designs are better than others. Can you work out the pros and cons of each design?
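
Whichever design you choose, you will need to pick speakers so that the Training and Test sets do not overlap and the controlled factors appear in the same proportions. The following is only a minimal sketch of one way to do that in Python. It assumes you have speaker metadata available as a list of dictionaries; the field names (`id`, `gender`, `accent`, `mic`), the toy speaker list, and the `quota` values are all invented for illustration and will not match the real data.

```python
import random
from collections import defaultdict


def select_matched(speakers, gender, quota, exclude=(), seed=0):
    """Pick speakers of one gender so that each (accent, mic) stratum
    contains exactly the number of speakers given in `quota`.
    Speaker IDs listed in `exclude` are never chosen."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    excluded = set(exclude)
    pools = defaultdict(list)
    for s in speakers:
        if s["gender"] == gender and s["id"] not in excluded:
            pools[(s["accent"], s["mic"])].append(s["id"])
    chosen = []
    for stratum, count in sorted(quota.items()):
        pool = sorted(pools[stratum])
        if len(pool) < count:
            raise ValueError(
                f"only {len(pool)} '{gender}' speakers for {stratum}, need {count}"
            )
        chosen.extend(rng.sample(pool, count))
    return chosen


# Toy metadata (hypothetical): 6 speakers for every combination of factors.
combos = [
    (g, a, m)
    for g in ("f", "m")
    for a in ("UK", "non-UK")
    for m in ("headset", "laptop")
    for _ in range(6)
]
speakers = [
    {"id": f"s{i:03d}", "gender": g, "accent": a, "mic": m}
    for i, (g, a, m) in enumerate(combos)
]

# Number of speakers wanted in each (accent, mic) stratum (hypothetical).
quota = {
    ("UK", "headset"): 3,
    ("UK", "laptop"): 2,
    ("non-UK", "headset"): 2,
    ("non-UK", "laptop"): 1,
}

train = select_matched(speakers, "f", quota, seed=1)
test_a = select_matched(speakers, "f", quota, exclude=train, seed=2)
test_b = select_matched(speakers, "m", quota, seed=3)
```

Using the same `quota` for every set keeps the accent and microphone proportions identical, so that gender is the only factor that differs between Test set A and Test set B.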

What effect does microphone type have?

Design an experiment to discover whether the microphone type is important. This might involve discovering if some microphones give lower Word Error Rate than others, or finding out the effect of mismatches between the Training and Test sets. Remember to control all the other factors.
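
One way to organise such an experiment is to list every combination of training and test microphone, then train and test one system per combination. The sketch below only enumerates the conditions; it does not train or test anything. The microphone type names are placeholders, so substitute whatever types are actually present in the data.

```python
from itertools import product


def condition_grid(mic_types):
    """Return every (training mic, test mic) pairing, flagged as
    matched when both sets use the same microphone type."""
    return [
        {"train_mic": train, "test_mic": test, "matched": train == test}
        for train, test in product(mic_types, repeat=2)
    ]


# Placeholder microphone types (assumed, not taken from the real metadata).
grid = condition_grid(["headset", "laptop"])

for cond in grid:
    label = "matched" if cond["matched"] else "mismatched"
    print(f"train on {cond['train_mic']}, test on {cond['test_mic']}: {label}")
```

Comparing the matched conditions with each other shows whether one microphone is easier than another; comparing matched with mismatched conditions shows the cost of mismatch.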

You can perform equivalent experiments to investigate the gender and accent factors too.

What effect does the amount of training data have?

In machine learning, it’s often said that more training data is better. But is that always the case? Design some experiments to explore this. Include cases where the Training and Test sets are well-matched (e.g., in gender and/or accent) and cases where there is mismatch. Which is more important: matched training data, or just more data?
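
When varying the amount of training data, one option (a design choice, not the only valid one) is to make the training sets nested, so that each larger set contains all the speakers of the smaller ones. Differences in Word Error Rate are then more likely to be due to the amount of data than to which speakers happened to be chosen. A minimal sketch, using made-up speaker IDs and set sizes:

```python
import random


def nested_subsets(speaker_ids, sizes, seed=0):
    """Return a dict mapping each size to a list of speakers, where
    every smaller list is a prefix of every larger one."""
    if max(sizes) > len(speaker_ids):
        raise ValueError("not enough speakers for the largest subset")
    rng = random.Random(seed)   # fixed seed so the subsets are repeatable
    order = sorted(speaker_ids)
    rng.shuffle(order)          # one random ordering shared by all sizes
    return {n: order[:n] for n in sorted(sizes)}


# Hypothetical pool of 40 training speakers and three training set sizes.
pool = [f"s{i:03d}" for i in range(40)]
subsets = nested_subsets(pool, sizes=[5, 10, 20], seed=1)

for n, ids in subsets.items():
    print(n, ids[:3], "...")
```

To compare matched data against more data, you could build one such series from a matched pool of speakers and another from a mismatched pool, then test both on the same Test set.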

These questions are very important in commercial systems: it costs a lot of money to obtain the training data, so we want to collect the most useful data we can.