Forum Replies Created
If there is no trained model, then check that you have actually run that stage:
TRAINDNN : True
Remember that you can modify (and improve!) the code. Here, it looks like you should add some debugging around the place where the error occurs. For example, print out the sox command just before executing it.
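To make that concrete, here is a minimal sketch of the kind of debugging print meant here. The function names and sox arguments are hypothetical – substitute whatever the real code actually passes to sox:

```python
import subprocess

def build_sox_command(in_wav, out_wav, rate=16000):
    # hypothetical arguments -- adapt to whatever the real code passes to sox
    return ["sox", in_wav, "-r", str(rate), out_wav]

def run_sox(cmd):
    # debug: print the exact command just before executing it,
    # so a failing call can be copied and re-run by hand in a shell
    print(" ".join(cmd))
    subprocess.check_call(cmd)
```

If the command fails, you can paste the printed line straight into a terminal to reproduce the error outside the training script.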
First, you cannot upgrade the copy in
/Volumes/Network/courses/ss/dnn/dnn_tts/
– that is not owned by you! You need to check out your own copy. If you still get the error, then do
$ svn upgrade
then
$ svn update
There is a working version of bandmat here
/Volumes/Network/courses/ss/dnn/dnn_tts/bandmat
Using screen to create a persistent remote session
Now let’s imagine that we want to run a program on the remote machine that takes a long time. It’s risky to do this directly on the command line, because if the ssh connection fails, then the program will terminate.
One option is to use nohup (find information on the web) but I strongly recommend using a different (and strangely underused) technique: screen.
screen gives you a persistent session on the remote machine. You can disconnect and reconnect to it whenever you like, and leave programs running in it. They will continue running after you disconnect.
To make screen easy to use, you need to set up a configuration file first. Log in to the remote machine and create a file called .screenrc in your home directory there. Note that this filename starts with a period. Here is an example of what to put in this file – the first line has a tricky character sequence – two backquotes inside double quotes:
escape "``"
and after that line, put this:
# define a bigger scrollback, default is 100 lines
defscrollback 1024
hardstatus on
hardstatus alwayslastline
hardstatus string "%{.bW}%-w%{.rW}%n %t%{-}%+w %=%{..G} %H %{..Y} %d/%m %C%a "
# some tabs, with titles - change this to whatever you like
screen -t script -h 1000 1
screen -t log -h 1000 2
screen -t bash -h 1000 3
Once you’ve created that file (I’ve also attached an example – download and rename it), log out of the remote machine.
Now connect to the remote machine like this, which uses ssh to start screen:
$ ssh -t kairos.inf.ed.ac.uk /usr/bin/screen -D -R
You can navigate between the tabs using the key sequence ` (backquote) then either p (previous) or n (next). You have three separate shells running in this example.
If you actually need to type a backquote character, just press ` twice.
Try disconnecting – either just kill the Terminal that screen is running in, or use the key sequence ` (backquote) then d (for ‘detach’). To re-connect, just use the ssh command above. Your screen will come back just as you left it – magic!
Attachments:
June 6, 2016 at 14:44, in reply to: Synthesising directly from a phone sequence rather than text #3230

SayPhones is probably only going to work for a diphone voice, not a Multisyn unit selection voice. Try loading a diphone voice and see if that works. You are going to get monotonic F0 though, I think.
Two things to do
1. ask for more quota (http://www.inf.ed.ac.uk/systems/support/form/ – mention my name)
2. make a directory in /disk/scratch on the local machine you are working on – this is NOT backed up, so should just be used for temporary working space
I’ve updated …/dnn_tts/configuration/configuration.py in the centrally installed version.
June 5, 2016 at 10:41, in reply to: Synthesising directly from a phone sequence rather than text #3224

I’m not sure of the solution to this. Let’s talk in person – is Festival the best framework for you, or should we consider a DNN system?
You don’t need utterance structures for the very simple case that you are trying at this point (treat letters as phonemes, and use no other linguistic information). To build a voice, you simply need to figure out how to create the input features for training the DNN. Use the “Prepare the input labels” steps of the DNN voice building exercise as your starting point, but replace some steps with your own scripts.
For example, you do not need the step “Convert utterance structures to full context labels” – you need to create these full context labels using your own script (I suggest starting with a “full context” of triphones or quinphones).
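As a sketch of what such a script might look like, here is a minimal triphone labeller in the common l-c+r notation, padding the edges with sil. Timing information is omitted – add start/end times if your label format needs them:

```python
def phones_to_triphone_labels(phones):
    """Turn a phone sequence into simple triphone 'full context' labels.

    Each label is written as left-centre+right; utterance edges are
    padded with 'sil'. Extend the format for quinphones or richer context.
    """
    padded = ["sil"] + list(phones) + ["sil"]
    labels = []
    for i in range(1, len(padded) - 1):
        left, centre, right = padded[i - 1], padded[i], padded[i + 1]
        labels.append("%s-%s+%s" % (left, centre, right))
    return labels
```

For the letters-as-phonemes case, you would simply pass in the letters of each utterance.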
The “Convert label files to numerical values” will be essentially the same, but you’ll need to modify the questions so that they correctly query your labels.
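For triphone labels in l-c+r notation, the modified questions might look something like the following. These are hypothetical HTS-style question-file entries – the patterns must match exactly how your own labels are written:

```
QS "L-e"  {e-*}
QS "C-e"  {*-e+*}
QS "R-e"  {*+e}
```

Each question tests whether the left, centre, or right phone of a label is a particular symbol; you would generate one such triple per letter in your alphabet.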
It’s well worth doing all of this with your own scripts (they are quite simple) because this will give you a deeper understanding of all the steps involved. Then, you could switch to the Ossian framework, which will automate some of this for you.
Yes – you are on the right lines – just assume that each letter is a phoneme.
A malloc (“memory allocation”) error of “can’t allocate region” suggests that you are running out of memory (RAM). Try reducing the minibatch size.
In sequence training, the minibatches need to be constructed from entire utterances, rather than randomised frames. So, the minibatch size will vary slightly, and not be constant. This may be why you only get this error seemingly randomly.
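To illustrate the idea (this is a sketch, not the toolkit’s actual implementation): when whole utterances must stay together, the frame count per minibatch can only approximate a target, so peak memory use varies from batch to batch:

```python
def utterance_minibatches(utterances, target_frames):
    """Group whole utterances into minibatches of roughly target_frames frames.

    Each utterance is a list of frames; utterances are never split, so the
    actual minibatch size varies around the target (and memory use with it).
    """
    batches, batch, n = [], [], 0
    for utt in utterances:
        # start a new batch if adding this utterance would exceed the target
        if batch and n + len(utt) > target_frames:
            batches.append(batch)
            batch, n = [], 0
        batch.append(utt)
        n += len(utt)
    if batch:
        batches.append(batch)
    return batches
```

An unusually long utterance can therefore push one batch well past the target, which is consistent with the malloc error appearing only occasionally.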
You can take a copy of bandmat from
/Volumes/Network/courses/ss/dnn/dnn_tts/bandmat
or add that location to your PYTHONPATH.

Currently libxml isn’t working on the lab machines, so the lxml Python module cannot be used. But it’s not needed – comment out all lxml imports (or imports of modules from lxml). These will be in
frontend/label_composer.py
frontend/label_normalisation.py
Or, if you want to be more future-proof (you might need the libxml functions if you want to integrate with Ossian), wrap the imports in
try...except
such as:

try:
    from lxml import etree
except ImportError:
    print "Failed to import etree from lxml"
or
try:
    import lxml
    from lxml import etree
    from lxml.etree import *
    MODULE_PARSER = etree.XMLParser()
except ImportError:
    print "Failed to import lxml"
Stripping problematic punctuation from utts.data should be OK in your first build of this system. Come back and solve this problem later.