Modify the do_alignment script

Optionally, you can modify the do_alignment script in order to experiment with the forced alignment. For example, if your alignment is already pretty good, you might modify the script to make the alignment worse, in order to examine what effect that has on the synthesis quality.

To change the script, you first need to make your own copy of it. Find out where the script is:

$ which do_alignment

which will show you the full path to the script. Now, make a copy of it, and put that in your alignment directory:

$ cd alignment
$ cp /Volumes/Network/...use path from above.../do_alignment .

and now you can edit this script in your favourite editor. To run your version, rather than the central version, you need to run it like this:

$ cd alignment
$ ./do_alignment .

when you reach the step "Doing the alignment".
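
The lookup and the copy can also be combined, so that the path does not have to be pasted by hand. This is only a sketch: copy_tool is a hypothetical helper name, not part of the course scripts, and it assumes do_alignment is an ordinary executable found on your PATH.

```shell
# copy_tool NAME DEST_DIR
# Look up NAME on the PATH and copy it into DEST_DIR.
# (copy_tool is a hypothetical helper, not part of the course scripts.)
copy_tool() {
    name=$1
    dest_dir=$2
    # 'command -v' prints the full path of an executable found on the PATH,
    # much like 'which' does.
    src=$(command -v "$name") || {
        echo "copy_tool: $name not found on PATH" >&2
        return 1
    }
    cp "$src" "$dest_dir/"
}
```

With that function defined in your shell, running `copy_tool do_alignment .` from inside the alignment directory should leave a local copy there.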

Training on a subset of data, but aligning the whole database

You need to make a version of train.scp that lists only the MFCC files you want to train the models on; let’s assume you’ve called it train_subset.scp. Then make sure that every execution of HCompV and HERest in your copy of the script uses this smaller script file, and leave the alignment step itself using the full train.scp so that the whole database is still aligned. For example:

HERest -C config -T 1023 -t 250.0 150.0 1000.0 -H hmm${i}/MMF -H hmm0/vFloors -I aligned.0.mlf -M hmm$[$i +1] -S train_subset.scp phone_list
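
One simple way to build the smaller list is to keep only the first few entries of the full one. The sketch below assumes train.scp has one MFCC file path per line; make_subset is a hypothetical helper name, and taking the first N lines is just one possible way of choosing a subset.

```shell
# make_subset SRC N DEST
# Write the first N lines of SRC (one MFCC path per line) to DEST.
# (make_subset is a hypothetical helper, not part of the course scripts.)
make_subset() {
    src=$1
    n=$2
    dest=$3
    head -n "$n" "$src" > "$dest"
}
```

For instance, `make_subset train.scp 100 train_subset.scp` would keep the first 100 entries; the number 100 is only an illustration.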

Changing the number of mixture components

The default models have a mixture of 8 Gaussian components in each output probability density. You can vary that number. HTK uses a method called “mixing up” to gradually increase the number of components. By default, the script goes from 1 to 2, then 3, then 5 and finally 8 components. You can change this schedule by editing one line in the script:

# Increase mixtures.

for m in 2 3 5 8 ; do
...

Try using more or fewer components (possibly just one component) to see what effect this has.
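
Before editing the loop, it can help to check what steps a candidate list would produce. The following is a dry run only: mixup_schedule is a hypothetical helper that prints the steps without calling any HTK tool, and it assumes the models start with a single component, as described above.

```shell
# mixup_schedule "LIST"
# Print each mixing-up step that a list such as "2 3 5 8" would produce,
# starting from a single component. Nothing is trained or modified.
# (mixup_schedule is a hypothetical helper, not part of the course scripts.)
mixup_schedule() {
    prev=1
    # $1 is deliberately unquoted so that the list splits into words.
    for m in $1; do
        echo "$prev -> $m"
        prev=$m
    done
}
```

Calling `mixup_schedule "2 3 5 8"` prints one line per step of the default schedule. An empty list prints nothing, because the loop body never runs; that is how a loop of this shape would behave if you wanted to stay at one component.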

Changing the vowel reductions

You don’t need to modify the do_alignment script to achieve this – just edit the phone_substitutions file. Try removing all substitutions (i.e., make phone_substitutions an empty file).
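
To make that experiment easy to undo, you could keep a backup before emptying the file. This is a minimal sketch: empty_with_backup and the .orig suffix are hypothetical choices, not something the course scripts rely on.

```shell
# empty_with_backup FILE
# Save a copy of FILE as FILE.orig, then truncate FILE to zero length.
# (empty_with_backup is a hypothetical helper, not part of the course scripts.)
empty_with_backup() {
    file=$1
    # Stop if the backup cannot be made, so the original is never lost.
    cp "$file" "$file.orig" || return 1
    # ':' does nothing; the redirection alone empties the file.
    : > "$file"
}
```

After `empty_with_backup phone_substitutions`, the original contents can be restored with `cp phone_substitutions.orig phone_substitutions`.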