Forum Replies Created
That calculation is hardcoded in the C++ (because the time taken to calculate join costs pretty much dominates everything else in terms of runtime).
It wouldn’t actually be that hard to change it – the code responsible is in the file
$FESTIVALDIR/src/modules/MultiSyn/EST_JoinCost.h
But you would of course have to recompile festival to make your own modified executable containing any changes you make – *that* could be more or less time-consuming, depending on your prior experience with C++, compiling festival, etc.
What was the secret to getting it working in the end? 😀
One other thing to add – some have suggested you need to use https:// links to the wave files in your Qualtrics test (wherever you put them) rather than http:// links.
I’d be interested to hear whether + how you get this working!
I’m not a Praat user at all, so I cannot offer any advice on that I’m afraid.
With Wavesurfer it is pretty easy though. Once you have converted your pitchmarks to the label file format (e.g. make_pmlab_pm pm/*.pm), just open Wavesurfer and load the wave file you want to view. You can either:
1) Just choose “Transcription” as the configuration option when initially opening the wave file. Wavesurfer will show the spectrogram and a transcription pane. To load the labels, right click in the transcription pane and select “load labels” and navigate to the correct label file.
2) Add labelling to any open view by right clicking to create another pane – choose Transcription. Then right click in that new pane to load the labels as above.
Tips: 1) if the “.lab” file is in the same directory as the corresponding .wav file, Wavesurfer will just load the labels automatically; 2) right click on the transcription pane and choose “properties” – you can then select to “Extend boundaries into waveform and spectrogram panes”, which can make label viewing better.
Did you remember to rsync the extra files (~1GB) from the AT Lab machines to your virtual machine? The manifest.txt for that includes the tools and data for the Speech Synthesis course.
For the full details see: https://speech.zone/courses/speech-processing/module-3-speech-synthesis-front-end-1/tutorial-b/
Are you encountering this problem when using the ATLab virtual machine image, or remote desktop access to the AT lab machines?
If it is the latter, I have also noticed some filesystem glitches in the past few days (I was unable to run festival) – I reported it to is.helpline@ed.ac.uk and they seemed to fix it.
Anyway, taking ch_wave for example, you should be able to find it here:
[korin@PPLS_ATL_0011 ~]$ which ch_wave
/Volumes/Network/courses/ss/festival/festival_linux/speech_tools/bin/ch_wave

Finding which part of that path is missing will indicate exactly what the problem is.
“Found data” is anything you can get your hands on which was created for another purpose. For example, using YouTube videos for training a speech synthesis model would be using “found data”. It contrasts with data that has been purpose-designed and recorded specifically for building a speech synthesis voice.
Each voice has a Scheme file that contains code defining how to set up the voice (e.g. which lexicon to use, where its data files are, what data to load, etc.).
In the case of the Speech Synthesis assignment, we’ve created a voice definition file which makes a voice out of data found in the current working directory (so you have to be in that directory to run any particular voice you build).
However, the extra step that’s needed is to register any voice with festival, so it knows that voice is available. You can do that for example by putting the voice definition in a standard place in the $FESTIVAL/lib directory. Alternatively, you can use the function “voice-location-multisyn” to register a voice that is found in a non-standard place. (see $FESTIVAL/lib/voices.scm for details on that)
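For illustration, here’s a minimal sketch of what that might look like (the voice name and directory here are invented, and you should check $FESTIVAL/lib/voices.scm for the exact arguments rather than trusting my memory!):

;; Hypothetical example – register a multisyn voice that lives
;; outside the standard location (the name and path are made up):
(voice-location-multisyn 'my_multisyn_voice
                         "/home/student/ss/my_voice/"
                         "My unit selection voice for the assignment")

Once registered, (voice.list) should show the voice as available.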
1. For the ASF features – pitch, power and duration are mentioned (“…each target phoneme has a target pitch, power and duration”)
2. Yes, place of articulation is indeed a linguistic (i.e. articulatory phonetic) feature – it’s a property of a *phone*, which is a linguistic *concept* rather than a physically measurable signal. In contrast, the f0, power or duration used as ASF features are things you can directly observe and measure in the acoustic signal (or derive from it).
3. Yes, when combining different subcosts (e.g. ASF and/or IFF ones) we would typically want to weight them, so we can balance their influence in the overall cost.
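To make the weighting idea concrete, here’s a toy Scheme sketch (purely illustrative – this is not Festival’s actual implementation, and the weight values are invented):

;; Toy illustration of weighting two subcosts – not Festival's real code.
(define (combined-cost asf-cost iff-cost)
  (let ((w-asf 0.7)   ;; invented weight for the ASF subcost
        (w-iff 0.3))  ;; invented weight for the IFF subcost
    (+ (* w-asf asf-cost) (* w-iff iff-cost))))

;; e.g. (combined-cost 0.2 0.8) => 0.38

Tuning weights like these (by hand, or with some optimisation procedure) is how you balance the influence of each subcost.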
To change the default behaviour of inserting silence when a missing diphone occurs across a word boundary, you would need to edit the C++ code and recompile festival (i.e. IIRC, this behaviour is hard-coded in C++ and not accessible in Scheme).
Yes, just 1 frame either side.
(The code is the ultimate documentation, and the code definitely says 1 frame!)
Yes, Simon’s right, this indicates you’ve done something like use the wrong lexicon at some stage.
Incidentally, you probably won’t find *_cl phones in the my_lexicon.scm file.
Actually, FYI, the *_cl symbols are only used in the forced alignment process. Stops (e.g. p, b, t, k…) are broken into two parts, one for the closure portion (e.g. p_cl, b_cl, t_cl, k_cl…) and one for the release. When you build utterances from the final MLF file, these are merged back together again in Festival’s Segment relation, but the boundary between them is used to record the diphone join point…
(I’m not sure which lexicon could have a Q though! Unless it stands for glottal stop…)
Does the script run at all? Is there a particular sentence it breaks on? Or perhaps, can you provide a minimal example of code that exhibits the problem?
“SIOD ERROR: wrong type of argument to get_c_val” is a rather generic error – it could be cropping up in a large number of ways – it just means some function is receiving an argument that is different from what it expects. So it’s impossible to tell what’s going wrong without more information.
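For example (an illustrative guess, not necessarily what your script is doing), passing a plain string to a function that expects an utterance object tends to produce this kind of complaint:

festival> (utt.synth "hello")

utt.synth expects an Utterance object rather than a string, so SIOD will complain about the argument type (though the exact wording of the error can vary). If you can narrow your problem down to one line of your script, it will be much easier to diagnose.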
Thanks + regards,
Korin
Don’t worry, you are only changing the backoff rules data structure in the memory of the currently running festival process – nothing is permanently changed. And you can easily reinstate the rules for the current festival session by running the above command, or to be specific:
(du_voice.setDiphoneBackoff currentMultiSynVoice (append '((ii @)) unilex-edi-backoff_rules))
unilex-edi-backoff_rules
is just a list variable – you can examine its contents by putting the variable name at the command prompt:
festival> unilex-edi-backoff_rules
then hit return.
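Since it’s an ordinary Scheme list, the usual list functions work on it too, for example:

festival> (length unilex-edi-backoff_rules)
festival> (car unilex-edi-backoff_rules)

(length tells you how many rules there are, and car shows just the first one.)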
Three thoughts come to mind:
i) yes, it seems the default backoff rules don’t include ii -> @
I don’t expect that’s for any good reason – probably just an oversight. You can try adding this to the list of rules your currently loaded Multisyn voice is using by entering this at the festival command prompt:
(du_voice.setDiphoneBackoff currentMultiSynVoice (append '((ii @)) unilex-edi-backoff_rules))
Does that make the phone-based backoff work more as you want it?
ii) the backoff phone list is just one way that Multisyn offers for dealing with missing diphones. It will also make phone joins instead if desired. For example, if you simply delete the list of phone backoff rules by doing this:
(du_voice.setDiphoneBackoff currentMultiSynVoice nil)
then Festival won’t be able to back off any phones, and so will do a phone join instead. Arguably, in the majority of cases this could be a better strategy than the default backoff rule, which is to back off to silence (and that seems like it could be a terrible idea!). I’d be interested in how that works in your case – can you try it and post back please? 🙂
(Or you could even explore that further in your write-up?…)

iii) actually, it seems odd that you don’t have a diphone to handle “the” before a vowel. “The” is obviously a really common word! Is it really the case that you don’t have a single “the” before a vowel-initial word anywhere in your voice dataset? That may be indicative of a problem. Or, of course, you may have selected a “peculiar” set of domain-specific data which doesn’t include one…?