Forum Replies Created
The dimension of your acoustic feature is not 5. Check which features you are using and their dimensions.
1) Yes, if you use run_dnn.py, it automatically uses 0.5 times the learning rate for the top hidden layers (>4). This is not the case for run_lstm.py.
2) They are not completely random: they are drawn from a Gaussian random generator with zero mean and a variance based on the input size.
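As a sketch of that initialization scheme: zero-mean Gaussian weights whose variance shrinks with the number of inputs (the exact 1/n_in scale factor is an assumption here; Merlin's actual scaling may differ):

```python
import math
import random

def init_weights(n_in, n_out, seed=1234):
    """Zero-mean Gaussian weight matrix, variance scaled by fan-in.
    The 1/n_in scale is an assumption; Merlin's scheme may differ."""
    rnd = random.Random(seed)
    std = math.sqrt(1.0 / n_in)
    return [[rnd.gauss(0.0, std) for _ in range(n_out)] for _ in range(n_in)]
```

Scaling the variance by the input size keeps the pre-activation magnitudes roughly constant across layers of different widths.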
https://www.random.org/gaussian-distributions/ (something like this)
3) Dropout is coded in run_lstm.py with the variable “dropout_rate”, set to 0 by default. You can add this variable to your configuration file under “Architecture” and fine-tune it.
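Assuming the usual Merlin-style configuration layout, the addition would look like the fragment below (the 0.5 value is purely illustrative, not a recommendation):

```
[Architecture]
dropout_rate: 0.5
```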
You can’t use the same configuration file for both the acoustic and duration models.
Please check the file below for the duration model configuration:
https://svn.ecdf.ed.ac.uk/repo/inf/dnn_tts/configuration/duration_configfile.conf
Please check the file below for the acoustic model configuration:
https://svn.ecdf.ed.ac.uk/repo/inf/dnn_tts/configuration/acoustic_configfile.conf
warmup_epoch is usually set to 10. The learning rate remains at the value you have set until warmup_epoch (i.e., epoch 10), and is then halved after every epoch thereafter.
If warmup_epoch is used, the number of epochs can be set to any value; network training then stops after a certain number of epochs if the validation error is not improving.
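The schedule described above can be sketched as follows (the base_lr value is illustrative, not Merlin's default):

```python
def learning_rate(epoch, base_lr=0.002, warmup_epoch=10):
    """Constant learning rate up to and including warmup_epoch,
    then halved at every subsequent epoch."""
    if epoch <= warmup_epoch:
        return base_lr
    return base_lr * (0.5 ** (epoch - warmup_epoch))
```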
Why are you not using warmup_epoch?
At the moment, the graph shows that the training may lead to overfitting if you extend beyond 20 epochs.
Yes, it is somewhat tuned for STRAIGHT (in fact, it started with STRAIGHT and was extended to WORLD), and the parameters differ with respect to the sampling frequency of waveform generation.
We measure MCD before applying the filter; the post-filtering results are stored in a separate file with the extension ‘.p_mgc’.
There is no need to specify the dimension to view the file; just use the command below:
$SPTK/bin/x2x +fa norm_info…MVN.dat > temp.txt
The ‘temp.txt’ file then contains exactly double the number of values as the dimension shown in the filename.
First N values represent means and the next N values represent variances.
In your case, it should have 500 values. To extract the mean and variance of the 250th unit, look at lines 250 and 500 for the mean and variance respectively.
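The layout above can be parsed with a small helper like this (a hypothetical sketch, not part of Merlin; it assumes the `x2x +fa` dump has one value per line):

```python
def read_mean_var(path, dim):
    """Parse an ASCII dump of a norm_info MVN file: the first `dim`
    values are means, the next `dim` values are variances."""
    with open(path) as f:
        values = [float(line) for line in f if line.strip()]
    if len(values) != 2 * dim:
        raise ValueError("expected %d values, got %d" % (2 * dim, len(values)))
    return values[:dim], values[dim:]
```

For a 250-dimensional file, the mean and variance of the 250th unit are `means[249]` and `variances[249]`, i.e. lines 250 and 500 of temp.txt.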
You should use the bottleneck model first to generate bottleneck features, append them to the input, and then use the second model to synthesize speech.
If the number of frames in the input and output files doesn’t match, the same error appears along with another error indicating a mismatch in frame length. So please check for other errors as well, if any appeared.
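One quick way to check the frame counts yourself, assuming the usual headerless float32 feature files (this is a sketch, not a Merlin utility):

```python
import os

def num_frames(path, dim, bytes_per_value=4):
    """Frame count of a headerless binary float feature file;
    `dim` is the per-frame feature dimension."""
    size = os.path.getsize(path)
    if size % (dim * bytes_per_value) != 0:
        raise ValueError("file size is not a multiple of the frame size")
    return size // (dim * bytes_per_value)
```

Comparing `num_frames(input_file, in_dim)` with `num_frames(output_file, out_dim)` reveals the mismatch before training is even started.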
Please attend the upcoming session on Merlin, where I’ll explain the step-by-step procedure to implement/debug duration modelling.
Answer to Q1: run_lstm.py has a well-defined variable for additional input (appended_input_dim), but run_dnn.py used to be hard-coded for the combined input dimension. The output remains the same: mgc with 60 and dmgc with 180.
Answer to Q2: At the moment, the bottleneck code is not completely implemented within Merlin. Yes, we need to have different folders: one for the label input and one for the bottleneck. The stacking of features is done outside Merlin, using independent scripts.
There are some conflicting things here… I’ll explain them in detail during my tutoring session this Friday.
1. “appended_input_dim : 512” is a variable that can be used only with run_lstm.py but not with “run_dnn.py”.
2. run_dnn_bottleneck.py is an old version of the code but works similarly to run_dnn.py (it doesn’t work with LSTM architectures).
As I can see, you haven’t configured the SPTK tool path in the configuration file, as clearly shown in the error:
“/this/path/does/not/exist/x2x +af *”
subprocess: stderr: /bin/sh: /this/path/does/not/exist/x2x: No such file or directory
Please change the path to the SPTK tools in the configuration files and re-run the GENWAV step.
Check for an if condition that says “save the model only after epoch 5”, and comment out that line to enable saving the model at every epoch:
if epoch > 5: ## comment out this line, and use a small learning rate.
Also, could you paste the validation and training errors somewhere like Pastebin and share the link here?
I am not sure about lexicon implementation in Ossian.
But instead of modifying the existing full-context label file, you can exclude any questions you don’t want to be part of the training (remove those lines from the question file) in either HMM or DNN training.
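Removing questions from the question file can be automated with a small filter like this (a minimal sketch, not Merlin code; the substrings you exclude are up to you):

```python
def filter_questions(in_path, out_path, exclude_substrings):
    """Copy a question file, dropping every question line that contains
    one of the given substrings (features to exclude from training)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if not any(s in line for s in exclude_substrings):
                dst.write(line)
```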
If you want to run the bottleneck features, you can do so without that import, so please comment out all those imports.
The bottleneck system has to be trained first; then use appended_input_dim as the bottleneck dimension for the second DNN.
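The “stack the features outside Merlin” step mentioned earlier amounts to a frame-wise concatenation, sketched below with plain per-frame vectors (illustrative only; real scripts would read/write the binary feature files):

```python
def append_bottleneck(label_feats, bottleneck_feats):
    """Frame-wise concatenation of linguistic label features with
    bottleneck features; both are lists of per-frame vectors."""
    if len(label_feats) != len(bottleneck_feats):
        raise ValueError("frame counts must match")
    return [l + b for l, b in zip(label_feats, bottleneck_feats)]
```

The resulting per-frame dimension (label dimension plus bottleneck dimension) is what the second network's input dimension must be set to.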