Forum Replies Created
-
AuthorPosts
-
Sounds like you are using the Windows Subsystem for Linux, which is not a full Linux implementation. It’s fine for some things, but not for doing the Speech Processing course.
You need to use a proper Linux machine, such as the Virtual Machine.
The error is telling you that VMware is not being allowed to display on your screen.
1. Always the first thing to try: reboot after installing VMware, then try the play button again (the installer adds low-level extensions to macOS that are only activated when the computer boots).
2. If you have VirtualBox installed (from previous testing that we asked you to do), uninstall it, then reboot (and possibly even re-install VMware and reboot again, just to be sure!).
3. VMware should have prompted you to agree to a kernel extension when you installed it, but this dialogue box doesn’t always appear. So, go to System Preferences (on your Mac) – Security & Privacy – General and look for something to click to agree to VMware. As well as the General tab, look in the Privacy tab and find all the places where you can tick VMware (e.g., Full Disk Access).
That’s correct – we solved Max W’s problem by completing the setup on the personal computer.
This is a new feature of recent versions of macOS (Catalina onwards) – Apple have changed the default shell from bash to zsh. It is not an error message, just information.
The (base) in your shell prompt suggests that you haven’t activated the Python virtual environment, so try:
(base) $ conda activate slp
and you should see your prompt change from (base) to (slp). Now try
(slp) $ jupyter notebook
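If you are not sure which environment is active (for example, because your prompt has been customised), you can check the CONDA_DEFAULT_ENV variable, which conda sets when an environment is activated. This is only a small sketch: the function name env_is_active is made up for illustration, and slp is the environment name used above.

```shell
# Succeeds only if the named conda environment is the active one.
# CONDA_DEFAULT_ENV is set by `conda activate`; env_is_active is a
# hypothetical helper name, not part of conda.
env_is_active() {
  [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if env_is_active slp; then
  echo "slp is active: OK to run jupyter notebook"
else
  echo "slp is not active (current: ${CONDA_DEFAULT_ENV:-none}): run 'conda activate slp' first"
fi
```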
OK, that set-up should be fine. The VM image has not changed. Just complete all the testing in Module 0 and make sure it all works, then you are good to go.
Those don’t look like important errors, so if you’ve completed all tasks then you’re done.
Which operating system and host software are you using? We did some testing with VirtualBox but are now recommending VMware.
I don’t think there is an electronic version of this book. It’s good and cheap enough to be worth buying; otherwise, the main library has multiple copies.
Festival provides functions to change a variety of weights, including those within the join cost, but not those within the target cost (which are defined in code). See this post.
festival_mac indicates the problem – the cause of this mysterious random change to the PATH is currently unknown, but there are some workarounds here.
We are doing what is called “flat start” training, which means going from “flat” models (i.e., with all the means set to 0 and the variances set to 1) directly to the Baum-Welch algorithm with data of complete utterances. This means we do not need to label the start and end of either words or phones (in contrast to the digit recogniser exercise, where we did hand-label the training data).
HResults needs another file, listing the valid labels, so you need to do:
$ HResults -p -I ./reference.mlf wordlist rec/intel*.rec
where the file wordlist contains a list of all possible words that could be found in the transcriptions or rec files.
I’ve also checked, and the dummy timestamps are not necessary: the rec files can just have one word per line.
Here’s one way to make the wordlist file, assuming rec files with one word per line and no timestamps:
$ cat reference.mlf rec/intel*.rec | egrep -v '#|"|\.' | sort -u > wordlist
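To see what that pipeline does, here is a small self-contained run on toy data. Everything in it is made up for illustration (the words and the scratch directory), and it uses grep -E, which is equivalent to egrep. It is a sketch to experiment with, not part of the assignment.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir rec

# Toy reference MLF (contents invented for illustration).
printf '%s\n' '#!MLF!#' '"*/intel1.lab"' the cat . '"*/intel2.lab"' a dog . > reference.mlf

# Toy recogniser output: one word per line, no timestamps.
printf '%s\n' the cat > rec/intel1.rec
printf '%s\n' a frog > rec/intel2.rec

# Drop the MLF header, the quoted label names and the "." terminators,
# then keep one copy of each remaining word.
cat reference.mlf rec/intel*.rec | grep -E -v '#|"|\.' | LC_ALL=C sort -u > wordlist

cat wordlist
```

The resulting wordlist contains a, cat, dog, frog and the, one per line. Note that the pattern removes any line containing #, " or a full stop, so a word that includes one of those characters would be dropped from the list too.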
Format of the reference MLF should be:
#!MLF!#
"*/intel1.lab"
word1
word2
word3
.
"*/intel2.lab"
word1
word2
.
with a final newline at the end of the file. The format of each rec file should also be one word per line (and you might need dummy start/end time?) – look at your rec files from the Speech Processing digit recogniser assignment.
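If you have many utterances, you may prefer to generate the reference MLF rather than type it. Here is a minimal sketch, assuming a hypothetical layout in which each transcription is a plain text file named after its utterance (e.g., intel1.txt containing the words separated by spaces or newlines). The function name make_mlf and the txt directory are my own inventions, so adapt them to however your transcriptions are stored.

```shell
# Write an MLF to standard output from one transcription file per utterance.
# make_mlf is a hypothetical helper; the input layout is an assumption.
make_mlf() {
  printf '#!MLF!#\n'
  for f in "$@"; do
    base=$(basename "$f")
    base=${base%.*}                      # strip the file extension
    printf '"*/%s.lab"\n' "$base"        # label name for this utterance
    # One word per line, skipping any empty lines.
    tr -s '[:space:]' '\n' < "$f" | sed '/^$/d'
    printf '.\n'                         # a full stop ends each entry
  done
}

# Example usage (paths are placeholders):
#   make_mlf txt/intel*.txt > reference.mlf
```

Because every printed line ends in a newline, the output also has the final newline that is needed at the end of the file.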
Are you sure you have loaded your unit selection voice? Your output looks like that from a diphone voice, such as the default voice when you first start Festival.
For a unit selection voice, you should see something like this when inspecting the Unit relation:
id _22 ; name #_h ; ph1 "[Val item]" ; sig "[Val wave]" ; coefs "[Val track]" ; middle_frame 10 ; source_utt arctic_a0379 ; source_ph1 "[Val item]" ; source_end 0.202 ; target_cost 0.0625 ; join_cost 0 ; end 0.136313 ; num_frames 14 ;
where source_utt tells you where the selected unit came from.
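If you want to see at a glance which source utterances were used, you can pull the source_utt values out of that printed output with standard shell tools. This is only a sketch: it assumes you have saved the printed items to a text file (units.txt is a made-up name), and source_utts is a hypothetical helper, not a Festival command.

```shell
# Read printed Unit relation items on standard input and print the value
# of each source_utt feature, one per line.
# source_utts is a hypothetical helper name.
source_utts() {
  # Put each "name value" pair on its own line, then keep only the
  # value that follows source_utt.
  tr ';' '\n' | sed -n 's/^ *source_utt  *\([^ ]*\).*/\1/p'
}

# Example usage (units.txt is a placeholder file name):
#   source_utts < units.txt | sort | uniq -c | sort -rn
```

The usage line counts how many units came from each source utterance, with the most frequently used utterance first.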
Taylor wrote that from his experience building two commercial systems, which were successors to Festival.
Festival doesn’t do anything to vary the components of the join cost, beyond the special case of one diphone being voiced at the join point and the other unvoiced (according to estimated F0).
The use of separate labels for closure and burst in plosives is only for forced alignment. It allows the join point to be placed reliably at the midpoint of the closure (the midpoint of the entire segment would sometimes be in the closure, sometimes in the burst, leading to synthesised plosives with 0, 1, or 2 bursts).