Syllable structure & stress

This topic has 4 replies, 5 voices, and was last updated 7 years, 8 months ago by Simon.

Author

Posts
- October 23, 2015 at 17:40 #399
  Merce V
  Student
  I am aware that Festival uses 0, 1 and 2 to refer to lexical stress (unstressed, primary stress and secondary stress). However, when examining the lexical stress for the word “upset” (in my sentence, used as a noun), Festival appends a 3 on the second syllable. What does this mean?
- October 23, 2015 at 20:26 #400
  Simon
  Professor
  It’s tertiary stress, which is marked up in the Unisyn lexicon – see Section 3.4.3 of the Unisyn manual. Tertiary stress is there not to show that a syllable might receive a pitch accent, but to block some post-lexical rules, such as vowel reduction.
  
  So, the second syllable in “upset” should never be reduced, in any context. I think Unisyn would regard “upset” as a compound word “up + set”, which is why the tertiary stress is marked up.
- October 3, 2016 at 10:47 #5061
  Lovisa W
  Student
  I was wondering about the syllable structure of the example word in the Week 3 lab – ‘caterpillar’. Is this done automatically or by hand? I am confused as to why the ‘p’ is part of the second syllable – normally I think it would be segmented as being the onset of the third syllable. Are there any situations which could arise where an incorrect or strange syllabification could cause problems in the synthesising of a word or sentence?
- October 3, 2016 at 17:57 #5067
  Simon
  Professor
  In Festival, you can detect when a pronunciation has come from the dictionary: it will have a correct Part Of Speech (POS) tag. Pronunciations predicted by the Letter To Sound (LTS) module have a ‘nil’ part of speech tag.
  
  The example of caterpillar here returns a nil POS tag.
  
  The syllabification of words whose pronunciations come from LTS must also be done automatically, and can therefore contain errors.
  
  An incorrect syllabification could indeed have consequences for speech synthesis later in the pipeline. It might affect the prediction of prosody. In unit selection, it might affect the units chosen from the database.
- October 22, 2016 at 12:51 #5539
  Simon V
  Student
  I know where to find explanations for most of the different steps from text to speech but I can’t seem to find a source on how Festival comes up with syllable structure and stresses for words that are not in the dictionary.
- October 23, 2016 at 18:41 #5563
  Simon
  Professor
  For the voice used in this assignment, this is done by rules hardwired into the low-level C++ code, which are specific to the Unilex dictionary.
  
  (You are not expected to be able to read or understand the code, but feel free to try).
  
  EDIT – see below for a more detailed answer explaining what the rules do.
- October 5, 2017 at 16:41 #7859
  Gonzalo V
  Student
  How do Festival and the lexicon ‘know’ where the stressed syllable is in a made-up word or a foreign name – maybe by means of an accent grammar?
- October 5, 2017 at 21:17 #7863
  Simon
  Professor
  Syllabification of out-of-dictionary words is rule-based, using sonority. Every vowel is assumed to be the nucleus of a syllable. The boundaries between syllables are placed at positions of minimum sonority.
  
  This requires knowing the sonority of every phoneme in the set used by the current lexicon. In Festival, sonority is calculated from the broad phonetic class.
  
  A good reference for sonority would be this classic textbook:
  
  Giegerich, H. J. (1992) “English Phonology: an Introduction” Cambridge University Press, Cambridge, UK.
  
  (Heinz Giegerich is the Professor of English Linguistics at Edinburgh University)
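  The rule described above (every vowel is a nucleus, boundaries go at sonority minima) can be sketched roughly as follows. This is an illustration only, not Festival's actual code: the function name `syllable_starts` is hypothetical, the input is assumed to be one sonority value per phone on a 1 to 5 scale with 5 meaning vowel, and the tie-breaking choice (the rightmost minimum wins) is an assumption made here for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed sonority value for vowels (top of a 1..5 scale).
const int kVowelSonority = 5;

// Hypothetical helper, not a Festival function.
// Input: one sonority value per phone. Output: the index of the first phone
// of each syllable (always starting with 0 for a non-empty input).
std::vector<std::size_t> syllable_starts(const std::vector<int>& sonority) {
    std::vector<std::size_t> starts;
    if (sonority.empty()) return starts;
    starts.push_back(0);

    bool have_nucleus = false;
    std::size_t prev_nucleus = 0;
    for (std::size_t i = 0; i < sonority.size(); ++i) {
        assert(sonority[i] >= 1 && sonority[i] <= kVowelSonority);
        if (sonority[i] != kVowelSonority) continue;  // only vowels are nuclei
        if (have_nucleus) {
            // Default for two adjacent vowels: split directly before the second.
            std::size_t boundary = i;
            int lowest = kVowelSonority;
            // Otherwise split before the least sonorous phone between the two
            // nuclei. Assumption: on a tie, the rightmost minimum is used.
            for (std::size_t k = prev_nucleus + 1; k < i; ++k) {
                if (sonority[k] <= lowest) {
                    lowest = sonority[k];
                    boundary = k;
                }
            }
            starts.push_back(boundary);
        }
        prev_nucleus = i;
        have_nucleus = true;
    }
    return starts;
}
```

  For a phone sequence with sonorities 5, 3, 1, 4, 5 (vowel, nasal, voiceless stop, liquid, vowel), the minimum between the two vowels is the stop, so the second syllable starts there and the nasal closes the first syllable.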
- October 5, 2017 at 21:26 #7864
  Simon
  Professor
  If you’re interested in how sonority is calculated from broad phonetic class in Festival, this is hard-coded as follows:
```
    if (p->val(f_vc) == "+") // vowel-or-consonant == vowel
        return 5;
    else if (p->val(f_ctype) == "l") // consonant-type == liquid
        return 4;
    else if (p->val(f_ctype) == "n") // consonant-type == nasal
        return 3;
    else if (p->val(f_cvox) == "+") // consonant-voicing == voiced
        return 2;
    else
        return 1;
```
  and the phoneme set used by the lexicon will have those features specified in a table (manually created by a phonetician) looking something like this example (which happens to be for Spanish):
```
   (#  - 0 - - - 0 0 -)
   (a  + l 3 1 - 0 0 -)
   (e  + l 2 1 - 0 0 -)
   (i  + l 1 1 - 0 0 -)
   (o  + l 3 3 - 0 0 -)
   (u  + l 1 3 + 0 0 -)
   (b  - 0 - - + s l +)
   (ch - 0 - - + a a -)
   (d  - 0 - - + s a +)
   (f  - 0 - - + f b -)
   (g  - 0 - - + s p +)
   (j  - 0 - - + l a +)
   (k  - 0 - - + s p -)
   (l  - 0 - - + l d +)
   (ll - 0 - - + l d +)
   (m  - 0 - - + n l +)
   (n  - 0 - - + n d +)
   (ny - 0 - - + n v +)
   (p  - 0 - - + s l -)
   (r  - 0 - - + l p +)
   (rr - 0 - - + l p +)
   (s  - 0 - - + f a +)
   (t  - 0 - - + s t +)
   (th - 0 - - + f d +)
   (x  - 0 - - + a a -)
```
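  To see how the two pieces fit together, here is a small self-contained sketch that applies the same if-chain to a few rows of the Spanish table. It assumes the columns after the phone name are ordered vowel-or-consonant first, consonant type sixth and consonant voicing eighth; the thread does not state the column order, so treat that as an assumption. The names `PhoneFeatures`, `kSpanishSample` and `sonority_of` are made up for this example and are not part of Festival.

```cpp
#include <cassert>
#include <map>
#include <string>

// Only the three features that the sonority rule inspects.
struct PhoneFeatures {
    std::string vc;     // vowel-or-consonant: "+" means vowel
    std::string ctype;  // consonant type: "l" liquid, "n" nasal, ...
    std::string cvox;   // consonant voicing: "+" means voiced
};

// Same decision order as the hard-coded rule quoted above.
int sonority(const PhoneFeatures& p) {
    if (p.vc == "+") return 5;     // vowel
    if (p.ctype == "l") return 4;  // liquid
    if (p.ctype == "n") return 3;  // nasal
    if (p.cvox == "+") return 2;   // any other voiced consonant
    return 1;                      // everything else
}

// A few rows from the Spanish table, reduced to (vc, ctype, cvox) under the
// assumed column order: 1st, 6th and 8th value after the phone name.
const std::map<std::string, PhoneFeatures> kSpanishSample = {
    {"a", {"+", "0", "-"}},
    {"l", {"-", "l", "+"}},
    {"m", {"-", "n", "+"}},
    {"b", {"-", "s", "+"}},
    {"p", {"-", "s", "-"}},
};

// Looks up a phone in the sample table and returns its sonority.
int sonority_of(const std::string& phone) {
    auto it = kSpanishSample.find(phone);
    assert(it != kSpanishSample.end());  // phone must be in the sample table
    return sonority(it->second);
}
```

  Under these assumptions the sample phones come out as a = 5, l = 4, m = 3, b = 2 and p = 1, which is the ordering vowel, liquid, nasal, voiced obstruent, voiceless obstruent.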