Forum Replies Created
Thanks! Mixing my Arctic recordings with my domain data (parliamentary speech) fixed the gibberish output.
1) But why? My domain script has fewer sentences, but because they are long, the total word count is more or less the same as Arctic A’s. How can the same amount of training data (recordings) produce a comprehensible voice in one case (Arctic A) but not in the other (my domain)?
If it is a data-size issue, would simply doubling my current domain script (i.e. twice as many words as Arctic A) be enough to build a voice from the domain script alone, without mixing in my Arctic A recordings as I’ve just done? Ideally, I would prefer not to mix my domain and Arctic A data.
I’m inclined to think it’s not the training data size causing the misalignment, because if it were, why would others be having misalignment issues with their Arctic A voices themselves? I’m assuming they’re using the same 593-line Arctic A script and recordings used for the default voice (section 5 of the exercise) for this assignment (re: Noe’s reply below), and that default voice seems to run without misalignment.
Thanks!
Thanks! Just to clarify, you mean add the utterances from Arctic’s utts.data into my domain’s utts.data, and the corresponding Arctic .wav files to my ‘wav’ folder as well, before running do_alignment, correct?
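In case it helps anyone following along, here is roughly what that merge amounts to, sketched in Python. The file names and the one-entry-per-line `( utt_id "text" )` format are my assumptions about the layout, not the actual course scripts:

```python
# Hedged sketch: merge two Festival-style utts.data files, assuming each
# non-empty line holds one utterance entry like: ( utt_id "text" )
# Paths and names here are placeholders, not the course's actual layout.
from pathlib import Path

def merge_utts_data(domain_path: str, arctic_path: str, out_path: str) -> int:
    """Append the Arctic entries after the domain entries; return total count."""
    lines = []
    for p in (domain_path, arctic_path):
        lines += [ln for ln in Path(p).read_text().splitlines() if ln.strip()]
    # Guard against duplicate utterance ids, which would confuse alignment.
    ids = [ln.split()[1] for ln in lines]  # token right after the '('
    assert len(ids) == len(set(ids)), "duplicate utterance ids after merge"
    Path(out_path).write_text("\n".join(lines) + "\n")
    return len(lines)
```

You would still need to copy the Arctic .wav files into the ‘wav’ folder by hand (or with `cp`), so that every id in the merged utts.data has a matching recording.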
It has to do with alignment. I looked at a few of my utterances and their .lab files in Wavesurfer, and some word labels are missing entirely. For example, the time span where I say “Documents” in one utterance has no non-sp/non-sil labels (just ‘sil’; see attached image).
I spoke to two others who had similar issues, and what solved it for them was redoing the whole exercise from scratch (from step 1, downloading ss.zip, through the final ‘run the voice’ step). I just did that and am still having this issue – I checked my utts.data file to make sure it’s formatted the same way as Arctic’s. I’ll try redoing it again, but before I do, I think it would help to know why, or which step of the voice-building process before do_alignment, is causing the alignment to go badly.
What could cause some words not to be labelled at all in the do_alignment step and the subsequent break_mlf alignment/aligned.3.mlf lab step? (This happens for words like “documents”, which isn’t OOV.)
What does this line mean? I see it a lot while running do_alignment:
WARNING [-8232] ExpandWordNet: Pronunciation 1 of sp is ‘tee’ word in HVite
Thanks!
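On the ‘tee’ warning: as far as I know it is usually harmless – HTK’s sp (short pause) model is a “tee” model, meaning it has a direct entry-to-exit transition and can be skipped entirely, and HVite is just reporting that. The missing word labels are the real problem. Here is a hedged sketch (my own code, not one of the course tools) that scans an HTK master label file such as aligned.3.mlf and lists utterances aligned to nothing but sil/sp, so you can see how widespread the problem is. It assumes the usual MLF layout: a `"*/utt.lab"` header, then one `start end label` line per segment, then `.`:

```python
# Hedged sketch: find utterances in an HTK master label file (MLF) whose
# alignment contains only sil/sp segments, i.e. whole words swallowed by
# silence. Assumes segment lines are at least "start end label".
def find_silent_utts(mlf_text: str) -> list[str]:
    bad, utt, labels = [], None, []
    for line in mlf_text.splitlines():
        line = line.strip()
        if line.startswith('"'):            # new utterance header
            utt, labels = line.strip('"'), []
        elif line == ".":                   # end of this utterance
            if utt is not None and all(l in ("sil", "sp") for l in labels):
                bad.append(utt)
            utt = None
        elif utt is not None and line and not line.startswith("#"):
            labels.append(line.split()[2])  # third field is the label
    return bad
```

Running it over aligned.3.mlf should tell you whether only a handful of utterances are affected (bad recordings or transcripts) or nearly all of them (something wrong earlier in the build).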
Attachments:
Hi,
May I ask what exactly this command does in the “Doing the alignment” step?
make_mfcc_list ../mfcc ../utts.data train.scp
It runs without error (my output is attached), but I’m trying to make sense of it because my voice is messed up and I’m not sure what went wrong.
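I don’t have the course script itself in front of me, but judging by the arguments, make_mfcc_list most likely just writes an HTK “script” (.scp) file listing the MFCC file for each utterance in utts.data, which the training/alignment tools then read. A hypothetical reimplementation of that idea (file extension and path layout are my assumptions):

```python
# Hedged sketch of what a script like make_mfcc_list plausibly does: read the
# utterance ids out of utts.data and write one MFCC path per line into an HTK
# .scp file. The real course script may differ in details.
from pathlib import Path

def make_mfcc_list(mfcc_dir: str, utts_data: str, out_scp: str) -> int:
    # Assume entries look like: ( utt_id "text" )
    ids = [ln.split()[1] for ln in Path(utts_data).read_text().splitlines()
           if ln.strip().startswith("(")]
    with open(out_scp, "w") as f:
        for utt_id in ids:
            f.write(f"{mfcc_dir}/{utt_id}.mfcc\n")
    return len(ids)
```

If that reading is right, the step can “run without error” and still produce a bad voice later, since it only lists files; it doesn’t check that the MFCCs themselves are sensible.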
Attachments:
Thanks!
I tried that, and rm -rf ~/.cache seems to have cleared the cache in the end.
Hi,
Reviving this thread because I have the same issue (see attached). I’ve tried the rm command to clear my trash. In my ss folder, I have one copy of my Arctic A data (wav and txt) and one copy of the unprocessed corpus (7 million sentences, in txt); I think those are the largest files there. Should I cut the corpus (860 MB) down to half? I don’t know whether that would free up sufficient disk space.
I’m using a PPLS computer – is there a form I have to submit to ask for more disk space (like the Informatics suggestion above)?
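Before deleting anything, it may be worth measuring what is actually using the space. This is my own helper, not a course tool: it walks a directory tree and reports the largest files, so you can confirm whether the 860 MB corpus (or something else entirely) is the culprit. From the shell, `du -a ~/ss | sort -n | tail` does much the same job:

```python
# Hedged sketch: walk a directory tree and return the top_n largest files
# as (size_in_bytes, path) pairs, biggest first.
import os

def largest_files(root: str, top_n: int = 10) -> list[tuple[int, str]]:
    sizes = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                pass  # skip broken symlinks and unreadable files
    return sorted(sizes, reverse=True)[:top_n]
```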
Attachments: