One component of the join cost is the fundamental frequency, F0. It is extracted separately from the pitch marks, although the two are closely related.
Whereas the pitch marks are required by the signal processing used for waveform generation, pitch contours (or more correctly, F0 contours) are required by the join cost.
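To make the role of F0 in the join cost concrete, here is a minimal sketch of one way an F0 mismatch at a join could be scored. It is an illustration only and not the formula the synthesiser actually uses. The function name f0_join_cost, the use of semitones, and the treatment of unvoiced frames are all assumptions made for this example.

```python
import math


def f0_join_cost(left_hz, right_hz):
    """Hypothetical F0 sub-cost for a join between two candidate units.

    left_hz  -- F0 in Hz at the end of the left-hand unit
    right_hz -- F0 in Hz at the start of the right-hand unit

    Returns the absolute F0 difference in semitones.
    """
    # Assumption: an F0 value of 0 (or below) marks an unvoiced frame,
    # and no F0 penalty is applied when either side is unvoiced.
    if left_hz <= 0 or right_hz <= 0:
        return 0.0
    # 12 semitones per octave; one octave is a doubling of frequency.
    return abs(12.0 * math.log2(right_hz / left_hz))


if __name__ == "__main__":
    print(f0_join_cost(120.0, 120.0))  # no discontinuity
    print(f0_join_cost(110.0, 220.0))  # one octave jump
```

A larger value means a bigger jump in pitch across the join. A real join cost would combine a term like this with other components and weights.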
bash$ mkdir f0
bash$ make_f0 -[mf] wav/*.wav
The -f and -m flags control the pitch tracker settings; pass exactly one of them in place of -[mf]. You can look in the make_f0 script if you want to see what the settings are. Choose -f if your voice has a pitch range typical of female speakers, or -m if your pitch range is typical of male speakers. Optionally (or come back to this step later), you could even make your own copy of the make_f0 script and directly modify the various pitch tracker settings to match your own pitch range (talk to a tutor in the lab first to be sure you understand exactly what they are, what units they are in, and so on).