Introduction

An overview of the complete process and some tips for success.

In this practical exercise, you’re going to build a neural text-to-speech synthesiser using recordings of your own voice.

Before starting, be a proper engineer and

  • keep a logbook to record every single step

You’ll find this invaluable if you need to repeat any steps, and your notes will also be useful for writing up a lab report at the end.

To build your synthetic voice, you will follow step-by-step instructions and use a variety of existing tools. Currently, we only support the University of Edinburgh “Eddie” compute cluster, because some steps require GPUs.

Here are the main stages in this exercise:

  1. Get access to the necessary computing facility and set up your environment
  2. Learn how to train the model, using some pre-existing data
  3. Create your own data
    • Select or design a recording script
    • Make the recordings in the studio
    • Prepare the data for training the model
  4. Train the model on your own data
  5. Evaluate the model(s) you have trained
  6. Write up a lab report

Read all the way through the instructions before you start!
