Forum Replies Created
It’s already installed on the studio computer, as you’ll learn during your training session this week. (You can’t use your personal computer in the studio.)
Why are you trying to install this software on your own computer?
It looks like the remote machine (the Eddie login node) is refusing the request, which might be due to resource limits. If you have processes running on the remote machine, try quitting or killing them, then try again. This topic may be relevant.
This might be for the same reasons that prevent VS Code connecting – see this topic for things to try.
There are tight limits on how many resources a single user can use on each login node, and this error message suggests you have reached the limit. Running VS Code and tensorboard on the same node may exceed the limit.
This is usually because you already have processes running on that node. You need to find them and terminate them. In a terminal on your local computer, ssh to the login node in question, then try some of the following:

# kill all VS Code processes
killall -KILL node

You might need to repeat the above until you eventually see node: no process found. Ignore any “Operation not permitted” messages – those relate to processes of other users (don’t worry – you cannot kill those!)

# kill all Tensorboard processes
killall -KILL tensorboard

If that doesn’t help, find all other processes belonging to you:

ps -elf | grep s1234567

and kill any that you no longer need (of course, if you kill your login shell, you’ll simply be logged out without warning):

kill -KILL <insert process number here>

When you say “HMMs need more data”, my first question would be “more than what?”
Read this reply and think about what train:test ratio is already built-in to the dataset.
Some small modifications to the do_alignment script are all that is required. Locate all the steps in that script where models are trained, and modify them to use a smaller list of training files.
You’ll need to make that smaller list – for example, you could create a file called train_subset.scp in which you list only the files that you want the models to be trained on.
Read this topic.
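As a sketch of making that smaller list (the file names train.scp and train_subset.scp, and the subset size of 20, are only examples – use your own list and size), the first lines of the full list can simply be copied out:

```shell
# Stand-in full training list, just so this sketch runs anywhere;
# on the lab machines you would use your real list of training files.
seq 1 100 | sed 's|^|data/train/utt|; s|$|.mfc|' > train.scp

# Keep only the first 20 files as the reduced training set
# (20 is an arbitrary example size).
head -n 20 train.scp > train_subset.scp

wc -l < train_subset.scp
```

You could also select a random subset with shuf instead of head, if you want the subset to be less dependent on the ordering of the full list.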
The Q label is the glottal stop (usually written as ? but mapped to Q for HTK purposes). It never occurs in the dictionary, but can be predicted by the letter-to-sound (LTS) model.
Don’t add it to the phone set. Instead, find out which word is causing LTS to generate a pronunciation involving ? and then add an entry for that word to my_lexicon.scm that does not use ?.
You can ignore these warnings – it’s OK if some things in the cache cannot be deleted (e.g., because they are still in use by running programs such as the Linux desktop manager).
It looks like you have large files elsewhere. Send me a direct email or Teams message with the output of the du commands in this post above.
(Also, try to avoid screenshots – they are not searchable for others using the forums – it’s better to copy-paste the text into a post.)
You need to figure out what is taking up all the space, then delete some of it.
This post, and the thread below it, will help you learn how to do that.
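One quick way to see where the space is going (a sketch – the path is an example, and GNU du/sort options are assumed) is to summarise the top-level items in your home directory, largest first:

```shell
# Summarise the size of everything at the top level of the home
# directory, sorted with the largest items first.
du -sh "$HOME"/* 2>/dev/null | sort -rh | head -n 10
```

Then descend into whichever directory tops the list and repeat, until you find what to delete.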
You are telling HRest to load models/proto/$PROTO whereas it should start with the models just created by HInit, although this won’t actually cause an error in HTK.
Look for an error message from HRest – if it is failing to create any models, it should report an error.
Remember to always wipe all models (from hmm0 and hmm1) before every experiment: this will help you catch errors.
You should use either a loop (around all the files to be recognised), or the -S option (to pass the name of a file in which all files to be recognised are listed), but not both.
The -S option will generally be faster(*). Why is that?
(*) although you might not notice the difference on the lab computers, because network speed typically dominates the run-time.
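To make the loop-versus-list distinction concrete, here is a runnable sketch in which echo stands in for the real HVite command line (the list contents and file names are invented):

```shell
# Example list of files to recognise (names are made up).
printf 'rec/one.mfc\nrec/two.mfc\n' > test.scp

# Option 1: a loop – the recogniser is started once per file,
# so models and dictionaries are reloaded every time.
while read -r f; do
    echo "recognising $f"          # stand-in for: HVite ... "$f"
done < test.scp

# Option 2: the -S option – one start-up, one model load,
# and all files processed in a single run.
echo "recognising every file listed in test.scp"   # stand-in for: HVite ... -S test.scp
```

The per-file start-up cost in option 1 is one reason the -S option is usually faster.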
Yes, there are a total of 39 elements in the feature vectors. 12 of them are the MFCCs. You have 12+13+13=38 though.
The language model computes P(W) where W is the word sequence of one utterance to be recognised. For the digit recogniser, can you locate and inspect the language model that is being used?
The acoustic model computes P(O|W) where O is the observation sequence. How does O relate to the MFCCs?
How are P(O|W) and P(W) combined to calculate P(W|O), and why do we need to do that?
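For reference (this is the standard relationship, not something specific to this assignment), the two quantities are combined by Bayes’ rule, and because P(O) does not depend on W it can be dropped when maximising over word sequences:

```latex
P(W \mid O) = \frac{P(O \mid W)\,P(W)}{P(O)}
\qquad
\hat{W} = \operatorname*{argmax}_{W} \; P(O \mid W)\,P(W)
```

Think about why the recogniser needs P(W \mid O) rather than P(O \mid W) on its own.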
You should also exercise extreme caution in uploading data to external AI tools, since they are likely to retain this data and potentially include it in the training data for a future update of the tool.
You do not have permission to share any of the data for this assignment outside the University.