Use bottleneck features to train Ossian acoustic model
- This topic has 4 replies, 3 voices, and was last updated 8 years, 1 month ago by Srikanth R.
June 17, 2016 at 14:27 #3271
Using Ossian, I would like to train the acoustic model using bottleneck features. Is this possible?
In the toolkit I have found the file “run_dnn_bottleneck.py” but it imports the architecture from a model called “sdae.py”, which is not in the models folder. Could this be made available?
Thanks
-
June 18, 2016 at 19:06 #3274
You can train with bottleneck features without that import, so simply comment out all of those imports.
The bottleneck system has to be trained first; then use the appended_input_dim as the bottleneck dimension for the second DNN.
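The two-stage idea described above can be sketched in plain numpy (a minimal illustration, not Merlin's actual API; the function name, layer sizes, and random weights are all placeholders for the trained first-stage network):

```python
import numpy as np

def forward_to_bottleneck(x, weights, biases):
    """Propagate the input through the first network's layers up to and
    including the bottleneck layer, using tanh activations."""
    h = x
    for W, b in zip(weights, biases):
        h = np.tanh(h @ W + b)
    return h

rng = np.random.default_rng(0)
lab_dim, bottleneck_dim = 425, 512          # illustrative dimensions

# Stage 1: stand-ins for the weights of the trained bottleneck DNN
W1 = rng.standard_normal((lab_dim, 1024)) * 0.01
b1 = np.zeros(1024)
W2 = rng.standard_normal((1024, bottleneck_dim)) * 0.01
b2 = np.zeros(bottleneck_dim)

x = rng.standard_normal((10, lab_dim))      # 10 frames of label features
bn = forward_to_bottleneck(x, [W1, W2], [b1, b2])

# Stage 2: append the bottleneck activations to the original input, so
# the second DNN sees lab_dim + bottleneck_dim features per frame
x2 = np.concatenate([x, bn], axis=1)
print(x2.shape)                             # (10, 937)
```

The appended width (lab_dim + bottleneck_dim, here 425 + 512 = 937) is the input dimension the second DNN must be configured with.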
-
June 19, 2016 at 14:23 #3277
While training with run_dnn_bottleneck.py we encountered several problems, which we managed to solve, and got the training running. However, the process remains a little vague, hence a few questions and comments that we would like to raise.
First, we commented out the import of the following modules: ms_dnn, ms_dnn_gv, sdae, as we are going to use just DNN model.
Second, we had to cast the learning rate with numpy.asarray to dtype float32. Could you please explain why this is necessary?
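For context on the float32 cast: Theano (on which Merlin's DNN code is built) keeps parameters in float32 shared variables for GPU training, and an update expression containing a float64 learning rate is promoted to float64, which can no longer be stored back into a float32 parameter. Plain numpy shows the same dtype promotion (a small illustration, independent of Merlin's code):

```python
import numpy as np

W = np.zeros(3, dtype=np.float32)             # float32 "parameters"
grad = np.ones(3, dtype=np.float32)           # float32 "gradient"

lr64 = np.asarray([0.002])                    # defaults to float64
print((W - lr64 * grad).dtype)                # float64: dtype has drifted

lr32 = np.asarray([0.002], dtype=np.float32)  # explicit float32 cast
print((W - lr32 * grad).dtype)                # float32: update stays put
```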
Third, we had to uncomment lines 574, 576 and 577. Those lines are responsible for creating label files in dnn_streams/binary_label_number and dnn_streams/nn_no_silence_lab_number, which are essential for the bottleneck training. We would like to know what the nn_no_silence labels are, and why they are required for bottleneck training but not for DNN training.
Fourth, while training we get "WARNING: no silence found!" Does this have anything to do with question 3?
Fifth, when specifying the sizes of the hidden layers in the config file, we changed the 5th of the 6 hidden layers to 512; the other 5 remain 1024. Is there any rule of thumb for deciding on the size of the bottleneck layer, or is it established empirically?
Thanks for your reply,
Norbert / Giorgia
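Regarding question 5: in Merlin-style configuration files the hidden layers are declared as parallel lists, and a bottleneck is simply one narrower entry in those lists. A hedged sketch of the setup described above (the exact key names may differ between recipes; in practice the bottleneck width is tuned empirically and is usually several times narrower than the surrounding layers):

```
[Architecture]
hidden_layer_size: [1024, 1024, 1024, 1024, 512, 1024]
hidden_layer_type: ['TANH', 'TANH', 'TANH', 'TANH', 'TANH', 'TANH']
```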
June 19, 2016 at 17:31 #3278
We have run the dnn_bottleneck.py file and stored the model.
Our understanding is that we need to use the hidden layer trained with the bottleneck model as input to run_dnn.py. The bottleneck layer is a binary file in a directory: voices/…/LAYER_005_TANH_W.npy. Is this the file that we have to use directly as the input, or do we have to convert it somehow to the lab format?
In relation to your previous answer, we have included an empty question file in the recipe config. What does "use the appended_input_dim as bottleneck dimension for the second DNN" mean? We have created a new variable in the config file, like this: "appended_input_dim : 512". Is this what you meant?
Thanks
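One point worth clarifying about the question above: LAYER_005_TANH_W.npy holds the *weights* of the bottleneck layer, not features, so it is not used directly as input. Bottleneck features are obtained by forward-propagating each utterance's label input through the trained layers and keeping the layer-5 activations. A hypothetical numpy sketch (the dimensions and random stand-in weights are illustrative only):

```python
import numpy as np

def bottleneck_features(x, layer_weights, layer_biases, upto=5):
    """Forward-propagate up to the bottleneck layer (tanh units) and
    return its activations: one bottleneck vector per input frame."""
    h = x
    for W, b in zip(layer_weights[:upto], layer_biases[:upto]):
        h = np.tanh(h @ W + b)
    return h

# toy stand-ins for the trained parameters (illustrative shapes only)
rng = np.random.default_rng(1)
dims = [425, 1024, 1024, 1024, 1024, 512]   # input -> ... -> bottleneck
Ws = [rng.standard_normal((a, b)) * 0.01 for a, b in zip(dims, dims[1:])]
bs = [np.zeros(d) for d in dims[1:]]

x = rng.standard_normal((100, dims[0]))     # 100 frames of label features
feats = bottleneck_features(x, Ws, bs)
print(feats.shape)                          # (100, 512)
```

In a real run the per-utterance activation matrices would be saved (e.g. with numpy.save) and appended to the second DNN's input, rather than being converted to lab format.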
June 22, 2016 at 14:27 #3293
There are some conflicting things here… I'll explain them in detail during my tutoring session this Friday.
1. “appended_input_dim : 512” is a variable that can be used only with run_lstm.py but not with “run_dnn.py”.
2. run_dnn_bottleneck.py is an old version of the code, but it works similarly to run_dnn.py (it doesn't work with LSTM architectures).