Synthesise!

It's time to generate synthetic speech from our trained model.

What you will learn in this part:

  • Performing inference to generate synthetic speech
  • How to listen to it

Inference

“Inference” is a fancy word for using our trained model to make a prediction: in this case, it predicts synthetic speech conditioned on the input text.

# Start an interactive GPU session with 1 GPU for 30 minutes
#  (jobs of less than 1 hour should get scheduled much quicker than longer jobs)
qlogin -q gpu -l gpu=1 -l h_rt=0:30:00 -P ppls_ssgpu

# You may need to wait for a compute node to be allocated by the scheduler
# If you experience a long waiting time, try again later

# Once your job is scheduled, you will obtain an interactive session on a GPU node

# On the GPU node, activate your environment
module load anaconda && conda activate py312torch27cuda118

# Set the paths
DATA_DIR=/exports/chss/eddie/ppls/groups/slpgpustorage/tts_cw
EXP_DIR=/exports/chss/eddie/ppls/groups/slpgpustorage/users/${USER}/tts_cw
TTS_PROJECT=your_project_name

# Go to the project directory
cd ${EXP_DIR}/${TTS_PROJECT}

# Perform inference for a single sentence
everyvoice synthesize from-text \
logs_and_checkpoints/FeaturePredictionExperiment/base/checkpoints/last.ckpt \
-v /exports/chss/eddie/ppls/groups/slpgpustorage/tts_cw/hifigan_universal_v1_everyvoice.ckpt \
-t "Hello world, this is my first utterance." \
-a gpu -d 1 --output-type wav

# To save GPU resources, log out as soon as you are finished by typing Ctrl+D, or:
logout
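
If the synthesis command fails immediately, a wrong working directory or a missing checkpoint is a likely cause. The following is a minimal sketch of a check to run before synthesising. It uses the two checkpoint paths from the command above; the helper name check_file is hypothetical and is not part of everyvoice.

```shell
# Checkpoint paths copied from the synthesis command above
FP_CKPT=logs_and_checkpoints/FeaturePredictionExperiment/base/checkpoints/last.ckpt
VOCODER_CKPT=/exports/chss/eddie/ppls/groups/slpgpustorage/tts_cw/hifigan_universal_v1_everyvoice.ckpt

# Hypothetical helper: report whether a file exists and is readable
check_file() {
    if [ -r "$1" ]; then
        echo "OK       $1"
    else
        echo "MISSING  $1"
        return 1
    fi
}

# The first path is relative, so this only works from the project directory
check_file "${FP_CKPT}" || echo "Are you in the project directory, and has training saved a checkpoint?"
check_file "${VOCODER_CKPT}" || echo "Check the path to the vocoder checkpoint"
```

Both lines should report OK before you run the synthesis command.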

To perform inference for a list of sentences stored in a file (called test_sentences.txt in this example):

everyvoice synthesize from-text \
logs_and_checkpoints/FeaturePredictionExperiment/base/checkpoints/last.ckpt \
-v /exports/chss/eddie/ppls/groups/slpgpustorage/tts_cw/hifigan_universal_v1_everyvoice.ckpt \
-f test_sentences.txt \
-a gpu -d 1 --output-type wav

# To save GPU resources, log out as soon as you are finished by typing Ctrl+D, or:
logout
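
The sentence file needs to exist before the command above is run. As a sketch, it can be created from the shell as shown here. This assumes that the file is plain text with one sentence per line; the sentences themselves are only placeholders, so replace them with your own test material.

```shell
# Write a small sentence list into the current directory
# (assumed format: plain text, one sentence per line)
cat > test_sentences.txt << 'EOF'
Hello world, this is my first utterance.
The quick brown fox jumps over the lazy dog.
How well does the model handle questions?
EOF

# Sanity check: the number of lines should equal the number of sentences
wc -l < test_sentences.txt
```

Creating the file does not need a GPU, so it is best done before requesting the interactive session.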

Listening to the synthetic speech

VS Code can play audio files: navigate to ${EXP_DIR}/${TTS_PROJECT}/synthesis_output in its file explorer and open one of the wav files there. To copy the audio files to somewhere other than ECDF, you will need to use the filesystem skills you learned earlier.
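
Before listening or copying, it can be useful to confirm that synthesis actually wrote some audio. The following is a minimal sketch. It assumes you run it from the project directory, so that the output directory is ./synthesis_output, and that the outputs are wav files as requested with --output-type wav. The helper name count_wavs is hypothetical.

```shell
# Output directory relative to the project directory (override OUT_DIR if yours differs)
OUT_DIR="${OUT_DIR:-synthesis_output}"

# Hypothetical helper: count wav files anywhere below the given directory
count_wavs() {
    find "$1" -type f -name '*.wav' | wc -l
}

if [ -d "${OUT_DIR}" ]; then
    echo "Found $(count_wavs "${OUT_DIR}") wav file(s) under ${OUT_DIR}"
else
    echo "No directory called ${OUT_DIR}: has synthesis been run from here?"
fi

# Copying to your own machine is a placeholder sketch only; run it on your
# own machine and fill in every <...> part with your own details:
# rsync -av <username>@<login_host>:<path_to_project>/synthesis_output/ ./synthesis_output/
```

If the count is zero, check the messages printed by the synthesis command before trying to copy anything.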