MDS for richer interpretation of MOS results?
- This topic has 3 replies, 2 voices, and was last updated 8 years, 6 months ago by Simon.
-
February 4, 2016 at 21:56 #2423
Say we carry out a Mean Opinion Score (MOS) naturalness judgement task comparing two synthetic voices (task A), and also a paired-stimuli naturalness comparison task (as in Mayo, Clark & King 2005), pooling stimuli from both voices (task B).
Would it be appropriate to interpret the MOS results from task A in light of a multidimensional scaling (MDS) analysis carried out on the results of task B?
That is, if MDS identified a dimension corresponding to naturalness, and a second principal dimension strongly corresponding to prosodic naturalness, could we conclude based on this that differences in prosody were the driving factor behind the MOS results?
-
February 7, 2016 at 14:45 #2557
Could you clarify the question a little? I’m not sure what you mean by “a dimension corresponding to naturalness, and a second principal dimension strongly corresponding to prosodic naturalness”.
-
February 7, 2016 at 15:14 #2558
Sorry for being unclear: I wrote the question in a bit of a rush!
Mayo, Clark & King (2005) note that MDS separates utterances into clusters. Upon examination, there is a cluster of “natural” utterances (6 and 7), a cluster of utterances which are unnatural due to prosody (5 and 1) and a cluster of utterances which are unnatural due to segmental errors (2 and 4). Based on this they conclude that prosodic and segmental errors are two separate dimensions which can influence naturalness judgements.
Suppose such a study were carried out alongside a MOS evaluation, with the same participants, and, unlike the example above, found only two clusters: one of prosodically unnatural utterances, and another containing both the natural and the segmentally unnatural utterances. Would this be good grounds to conclude that the MOS judgements were driven by prosody alone?
-
February 7, 2016 at 15:56 #2559
Yes, that would seem a reasonable conclusion. Your hypothetical MDS test has found that listeners only use prosodic naturalness to distinguish between stimuli. Either they do not hear segmental problems, or there are none (it doesn’t matter which).
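To make the hypothetical concrete, here is a minimal sketch of what that two-cluster outcome would look like. It is not the analysis from the paper: the dissimilarity values are invented for illustration (1.0 within a group, 5.0 across groups), the function `classical_mds` is a hypothetical helper implementing classical (Torgerson) MDS with plain power iteration, and splitting at the largest gap on the first dimension is just a simple stand-in for a proper cluster analysis.

```python
import math

def classical_mds(dissim, n_dims=2, n_iter=200):
    """Classical (Torgerson) MDS for a symmetric dissimilarity matrix."""
    n = len(dissim)
    sq = [[dissim[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared dissimilarities to get a Gram matrix.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * n_dims for _ in range(n)]
    for k in range(n_dims):
        # Fixed start vector keeps the run deterministic.
        start_norm = math.sqrt(sum((i + 1) ** 2 for i in range(n)))
        v = [(i + 1) / start_norm for i in range(n)]
        for _ in range(n_iter):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient estimates the eigenvalue for this direction.
        eigval = sum(v[i] * b[i][j] * v[j] for i in range(n) for j in range(n))
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the next dimension.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords

# Hypothetical data: utterance numbers follow the thread; values are invented.
utterances = [1, 2, 4, 5, 6, 7]
prosody_unnatural = {1, 5}

def made_up_dissimilarity(a, b):
    if a == b:
        return 0.0
    same_group = (a in prosody_unnatural) == (b in prosody_unnatural)
    return 1.0 if same_group else 5.0

dissim = [[made_up_dissimilarity(a, b) for b in utterances] for a in utterances]
coords = classical_mds(dissim)
dim1 = {u: c[0] for u, c in zip(utterances, coords)}

# Split the utterances at the largest gap along the first MDS dimension.
ordered = sorted(utterances, key=dim1.get)
gaps = [dim1[ordered[i + 1]] - dim1[ordered[i]] for i in range(len(ordered) - 1)]
cut = gaps.index(max(gaps)) + 1
clusters = [sorted(ordered[:cut]), sorted(ordered[cut:])]
print(clusters)
```

With these invented values the first dimension separates utterances 1 and 5 from 2, 4, 6 and 7, which is the pattern described above: the segmentally unnatural utterances sit with the natural ones. With real listener data the dissimilarity matrix would come from the paired-comparison responses, and the interpretation of each dimension would still need to be checked by listening to the stimuli.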