Forum Replies Created
Thank you!
I've attached the script. You need to change the extension to .py, and you run it like this (tested with Python 2):
python txtgrd2lab.py ./txtgrids/ ./labels/
(so the first argument is the folder where you have the TextGrids, and the second one is the folder where the label files will be saved).
I was able to open the output correctly with wavesurfer on my laptop, but please test it with wavesurfer in the lab to check that the labels are correct.
If you find any other errors or problems, let me know. By the way, one of your intervals had a missing label.
Even if the script works properly, it only works with interval-type TextGrids, so if anybody else wants to label with Praat, check your classmate's TextGrid and do it the same way; then you can use this script to convert to wavesurfer labels.
This script has not been properly tested, so you might run into errors or problems when testing it with other data.
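In case the attachment is not accessible, here is a minimal sketch of the kind of conversion the script does (this is my illustration, not the attached txtgrd2lab.py itself): it assumes interval-type TextGrids in the long format, keeps only each interval's xmin, xmax and text, and writes one "start end label" line per interval, which is the plain-text transcription format wavesurfer can read.

#!/usr/bin/env python
# Minimal sketch (not the attached txtgrd2lab.py): convert interval-type
# Praat TextGrids (long format) into plain-text label files, one
# "start end label" line per interval, with times in seconds.
import os
import re
import sys

def textgrid_to_lab(tg_path, lab_path):
    with open(tg_path) as f:
        text = f.read()
    # Each interval in a long-format TextGrid looks like:
    #     xmin = 0.25
    #     xmax = 0.41
    #     text = "ah"
    pattern = re.compile(
        r'xmin = ([\d.]+)\s*\n\s*xmax = ([\d.]+)\s*\n\s*text = "(.*?)"')
    with open(lab_path, 'w') as out:
        for xmin, xmax, label in pattern.findall(text):
            out.write('%s %s %s\n' % (xmin, xmax, label.strip()))

if __name__ == '__main__':
    tg_dir, lab_dir = sys.argv[1], sys.argv[2]
    for name in sorted(os.listdir(tg_dir)):
        if name.endswith('.TextGrid'):
            base = os.path.splitext(name)[0]
            textgrid_to_lab(os.path.join(tg_dir, name),
                            os.path.join(lab_dir, base + '.lab'))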
Hello! I wrote a bit of code to convert the format. Would you send me the wav file to see if it works? If you don't want to/can't post it here you can send it to my e-mail.
Ok, sorted out. Apparently there was something about my paths that SPTK did not like, so it was not working…
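In case it helps anyone else: looking at the traceback below, my guess is that the problem was the spaces in my directory names ("DNN Acoustic models", "DNN try WORLD"). The command for x2x is built as a single shell string, so an unquoted redirection target with spaces gets split and x2x ends up being asked to open a file called "Acoustic", which matches the error message. A minimal sketch of the idea (my own illustration, not the toolkit's run_process; the paths are hypothetical):

# Sketch only: quote any path that goes into a shell command string,
# otherwise a directory name with a space gets split by the shell.
import os
try:
    from shlex import quote   # Python 3
except ImportError:
    from pipes import quote   # Python 2

x2x = '/mnt/courses/ss/dnn/tools/SPTK-3.9/bin/x2x'
gen_dir = '/tmp/DNN Acoustic models'          # hypothetical dir with a space
weights = '1 1 ' + ' '.join(['1.4'] * 58)     # per-stream weights; the count here is illustrative

weight_file = os.path.join(gen_dir, 'weight')
cmd = 'echo %s | %s +af > %s' % (weights, quote(x2x), quote(weight_file))
print(cmd)   # this quoted string is what you would pass to subprocess with shell=True

The simpler fix, of course, is just to avoid spaces in the experiment directory names.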
Thanks Norbert, I tried, but it does not work. I've tried different versions of SPTK, both local and on the network, but I keep getting the same error about the “weight” file.
2016-07-03 11:42:06,644 CRITICAL subprocess: for command: echo 1 1 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 | /mnt/courses/ss/dnn/tools/SPTK-3.9/bin/x2x +af > /mnt/courses.homes/s1520337/Desktop/Dissertation_experiments/DNN Acoustic models/DNN try WORLD/gen/DNN_TANH_TANH_TANH_TANH_TANH_TANH_LINEAR__mgc_lf0_vuv_bap_1_3300_373_199_6_1024_1024/weight
2016-07-03 11:42:06,644 CRITICAL subprocess: stderr: Cannot open file Acoustic!

Hi, I also have a problem with GENWAV. I'm pointing to the right folder for SPTK, and it generated the acoustic parameters in the previous step, but now that I want to generate the waveform it can't, because it does not have a “weight” file. This is the error:
2016-07-02 12:06:53,072 CRITICAL subprocess: OSError for echo 1 1 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 | /mnt/courses/ss/dnn/tools/SPTK-3.9/bin/x2x +af > /mnt/courses.homes/s1520337/Desktop/Dissertation_experiments/DNN Acoustic models/DNN try WORLD/gen/DNN_TANH_TANH_TANH_TANH_TANH_TANH_LINEAR__mgc_lf0_vuv_bap_1_3300_373_199_6_1024_1024/weight
Traceback (most recent call last):
  File "/mnt/courses.homes/s1520337/Documents/dnn_tts/run_lstm.py", line 1089, in <module>
    main_function(cfg)
  File "/mnt/courses.homes/s1520337/Documents/dnn_tts/run_lstm.py", line 928, in main_function
    generate_wav(gen_dir, gen_file_id_list, cfg) # generated speech
  File "/mnt/courses.homes/s1520337/Documents/dnn_tts/utils/generate.py", line 171, in generate_wav
    .format(line=line, x2x=SPTK['X2X'], weight=os.path.join(gen_dir, 'weight')))
  File "/mnt/courses.homes/s1520337/Documents/dnn_tts/utils/generate.py", line 90, in run_process
    raise OSError

So the questions are:
– Where is this file generated? (and why don't I have it?)
– What information does this weight file contain?
– How do I fix the error?

Thank you!
Ok, so I did that; when I created the utt files, all of them were flagged as bad_pm.
Then when I try to build the LPCs, it complains that there are no pitch marks. I checked “make_lpc”, and at the end, where it says “Extract the LPC coefficients” and uses “sig2fv”, it asks for the pitch marks as an argument (but not when building the residuals with “sigfilter”).
So I checked “sig2fv_main.cc”, and it says in the introductory comments:
"-pm <ifile>  Pitch mark file name. This is used to \n"
"  specify the positions of the analysis frames for pitch \n"
"  synchronous analysis. Pitchmark files are just standard \n"
"  track files, but the channel information is ignored and \n"
"  only the time positions are used\n"

Then later, the only place I can see the pitch marks being used is:
// allocate and fill time axis
if (al.present("-pm"))
{
    if (read_track(full, al.val("-pm"), al))
        exit(1);
}

And given a comment at the end with some examples, I see that we are actually doing “Pitch Synchronous linear prediction”.
Sooo… I don't really understand the details, but:
1. Apparently the linear prediction analysis is done at time steps given by the pitch marks, using them as the centres of the analysis windows? (See the sketch after this post.)
2. So we don't actually need the pitch marks at run time, because the LPC coefficients have already been computed on that timing?
3. But then why don't we have that information for the residuals too, if, as you mentioned in another post, the residuals are concatenated using the pitch periods given by the pitch marks?
I hope you can explain this whole process in detail; I'm a little confused about the whole issue of pitch marks, LPC, residuals, concatenation, etc. It would be very helpful… thank you!
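To check my own understanding of point 1, here is a minimal sketch of what I think pitch-synchronous LPC analysis looks like (my own illustration, not what sig2fv does internally): one analysis window is centred on each pitch mark, and the LPC coefficients for that frame are computed from the autocorrelation of the windowed samples.

# Sketch of pitch-synchronous LPC as I understand it (not sig2fv's code):
# one analysis frame centred on each pitch mark, coefficients from the
# autocorrelation method.
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc_from_frame(frame, order=16):
    # LPC polynomial [1, -a1, ..., -ap] for one frame (autocorrelation method).
    frame = frame * np.hanning(len(frame))
    r = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
    a = solve_toeplitz(r[:order], r[1:order + 1])   # solve R a = r
    return np.concatenate(([1.0], -a))

def pitch_synchronous_lpc(signal, pitch_marks, sample_rate, order=16, window_ms=20.0):
    # One LPC frame per pitch mark; pitch mark times are in seconds.
    half = int(window_ms / 1000.0 * sample_rate) // 2
    frames = []
    for t in pitch_marks:
        centre = int(t * sample_rate)
        lo, hi = max(0, centre - half), min(len(signal), centre + half)
        frames.append(lpc_from_frame(signal[lo:hi], order))
    return np.array(frames)

# Toy usage: a synthetic 100 Hz signal (plus a little noise so the
# autocorrelation matrix stays well conditioned), pitch marks every 10 ms.
fs = 16000
t = np.arange(fs) / float(fs)
sig = np.sin(2 * np.pi * 100.0 * t) + 0.01 * np.random.randn(fs)
marks = np.arange(0.05, 0.95, 0.01)
print(pitch_synchronous_lpc(sig, marks, fs).shape)   # (number of marks, order + 1)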
Ok, thanks, that explains why we don’t need the f0…
but what about the pitch marks? Thank you!
Adding to this topic on the basics of NNs…
I don't understand how people choose the number of hidden layers, the number of units per layer, and the activation functions to put in them. Is it just a matter of trial and error? For example, in Zen's reading a footnote says:
“Although the linear activation function is popular in DNN-based regression, our preliminary experiments showed that the DNN with the sigmoid activation function at the output layer consistently outperformed those with the linear one.”
Is there any intuition for choosing your functions based on how you think the net should transform the input to get to the desired output (specifically here, for speech synthesis)? Or do you just try different combinations and, after getting the best result, try to understand why that architecture was better?
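Just to make the footnote concrete for myself, here is a small sketch of the kind of comparison it describes (my own illustration, not code from the paper or from dnn_tts): the same feed-forward net with either a linear or a sigmoid output layer. With a sigmoid output the acoustic targets would have to be normalised to [0, 1]; with a linear output they are usually mean/variance normalised. The sizes match my setup from the other thread (373 inputs, 199 outputs, six tanh hidden layers of 1024).

# Sketch (not from the paper): the same DNN with two different output
# activations, the comparison described in the footnote.
import torch.nn as nn

def make_dnn(input_dim, output_dim, hidden=1024, n_layers=6, sigmoid_output=False):
    layers = []
    prev = input_dim
    for _ in range(n_layers):
        layers += [nn.Linear(prev, hidden), nn.Tanh()]
        prev = hidden
    layers.append(nn.Linear(prev, output_dim))
    if sigmoid_output:
        layers.append(nn.Sigmoid())   # targets would need to be scaled to [0, 1]
    return nn.Sequential(*layers)

net_linear = make_dnn(373, 199)                        # linear output layer
net_sigmoid = make_dnn(373, 199, sigmoid_output=True)  # sigmoid output layer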
What if we want to time a process inside Festival, for example how long it takes to generate the utterances?
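I'm not sure what timing facilities are available inside the Scheme interpreter itself, but one workaround is to time a batch run from the outside; a minimal sketch (the script name is hypothetical):

# Sketch of a workaround: time a batch Festival run from outside.
# "build_utts.scm" is a hypothetical script name.
import subprocess
import time

start = time.time()
subprocess.call(['festival', '-b', 'build_utts.scm'])
print('generating the utterances took %.1f s' % (time.time() - start))

Or equivalently, just prefix the command with the shell's time builtin: time festival -b build_utts.scm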
Is it this function in “strip join cost coef” that calculates the middle point?
def join_point_time(item):
    if item.f_present("cl_end"):
        return item.F("cl_end")
    elif item.f_present("dipth"):
        return (0.75 * item.F("start")) + (0.25 * item.F("end"))
    else:
        return (item.F("start") + item.F("end")) / 2

Apparently it does something different for stops (it returns cl_end, presumably the end of the stop closure) and for diphthongs (a point a quarter of the way in: 0.75*start + 0.25*end); otherwise it just adds the start and the end and divides by 2 to get the midpoint.