It's time to generate synthetic speech from our trained model.
What you will learn in this part:
- Performing inference to generate synthetic speech
- How to listen to it
Inference
“Inference” is a fancy word for using our trained model to make a prediction: in this case, it predicts synthetic speech conditioned on the input text.
# Start an interactive GPU session with 1 GPU for 30 minutes
# (jobs of less than 1 hour should get scheduled much quicker than longer jobs)
qlogin -q gpu -l gpu=1 -l h_rt=0:30:00 -P ppls_ssgpu
# You may need to wait for a compute node to be allocated by the scheduler
# If you experience a long waiting time, try again later
# Once your job is scheduled, you will obtain an interactive session on a GPU node
# On the GPU node, activate your environment
module load anaconda && conda activate py312torch27cuda118
# Set the paths
DATA_DIR=/exports/chss/eddie/ppls/groups/slpgpustorage/tts_cw
EXP_DIR=/exports/chss/eddie/ppls/groups/slpgpustorage/users/${USER}/tts_cw
TTS_PROJECT=your_project_name
# Go to the project directory
cd ${EXP_DIR}/${TTS_PROJECT}
# Perform inference for a single sentence
everyvoice synthesize from-text \
logs_and_checkpoints/FeaturePredictionExperiment/base/checkpoints/last.ckpt \
-v /exports/chss/eddie/ppls/groups/slpgpustorage/tts_cw/hifigan_universal_v1_everyvoice.ckpt \
-t "Hello World, is my first utterance." \
-a gpu -d 1 --output-type wav
# To save GPU resources, logout as soon as you are finished by typing Ctrl+D, or:
logout
To perform inference for a list of sentences stored in a file called, for example, test_sentences.txt:
everyvoice synthesize from-text \
logs_and_checkpoints/FeaturePredictionExperiment/base/checkpoints/last.ckpt \
-v /exports/chss/eddie/ppls/groups/slpgpustorage/tts_cw/hifigan_universal_v1_everyvoice.ckpt \
-f test_sentences.txt \
-a gpu -d 1 --output-type wav
# To save GPU resources, logout as soon as you are finished by typing Ctrl+D, or:
logout
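The file-based command expects test_sentences.txt to exist in the project directory before you run it. The sketch below shows one way to create such a file from the shell. It assumes the file is plain text with one sentence per line. That layout is an assumption, so check the EveryVoice documentation if your sentences are not synthesized as expected. The example sentences are placeholders.

```shell
# Create a small test file in the current directory.
# Assumption: plain text, one sentence per line.
cat > test_sentences.txt << 'EOF'
This is my first test sentence.
The quick brown fox jumps over the lazy dog.
How well does the model handle a question?
EOF

# Check the result: show the file with line numbers, then count the sentences
cat -n test_sentences.txt
wc -l < test_sentences.txt
```

Writing the file on a login node before requesting the GPU session keeps the interactive session short.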
Listening to the synthetic speech
VS Code can play audio files directly. Navigate to ${EXP_DIR}/${TTS_PROJECT}/synthesis_output and open one of the wav files. To copy the audio files off ECDF, use the filesystem skills you learned earlier.
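Before listening or copying, it can help to confirm from the terminal that synthesis actually wrote some audio. The sketch below defines a hypothetical helper, list_wavs (not part of EveryVoice), which lists the wav files under a directory and counts them. It assumes EXP_DIR and TTS_PROJECT are set as in the inference step, and falls back to the current directory if they are not.

```shell
# Hypothetical helper: list the .wav files under a directory and count them
list_wavs() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "No such directory: $dir" >&2
        return 1
    fi
    # Show each wav file with its size (ls is not run if nothing matches)
    find "$dir" -type f -name '*.wav' -exec ls -lh {} +
    echo "Total: $(find "$dir" -type f -name '*.wav' | wc -l | tr -d ' ') wav file(s)"
}

# Usage, assuming EXP_DIR and TTS_PROJECT are set as in the inference step
list_wavs "${EXP_DIR:-.}/${TTS_PROJECT:-.}/synthesis_output" || true
```

A total of zero, or a missing directory, suggests the synthesis command did not finish, so check its output before trying to copy anything.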