Scoring SUS Test for Intelligibility
The Benoit paper states they scored sentence transcriptions as “entirely correct” or not, allowing for homophones. I’m not sure this is the most appropriate method for scoring my SUS tests for a few reasons…
Very few sentences have been transcribed ENTIRELY correctly, but many are correct except for one or two words. Some mistakes are fairly minor, such as “fleet ribbons” becoming “flee tribbins”. If I only score sentences that are entirely correct, I believe it will misrepresent the results and make them look worse than they are.
What is a good halfway measure here? I could score for correct words, correct phonemes, etc., but then I suppose I would need to normalise for sentence length. Would that be an appropriate way to score?
In the Blizzard Challenge, and almost everywhere else, Word Error Rate (WER) is used. I have rarely, if ever, seen anyone following the recommendation from the original paper to score entire sentences.
With WER, there is no need to normalise for sentence length, because the error count (substitutions + deletions + insertions) is already divided by the number of words in the reference sentence. Just use the same formula that is used in automatic speech recognition.
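In case it helps, here is a minimal Python sketch of word-level WER computed via Levenshtein distance. The example sentence is made up for illustration, not an actual SUS stimulus:

```python
def wer(reference, hypothesis):
    """Word Error Rate: (substitutions + deletions + insertions) / reference length,
    computed with a word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between the first i reference words and first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution (or match)
    return d[len(ref)][len(hyp)] / len(ref)

# Two substitutions against a 6-word reference: WER = 2/6 ≈ 0.33
print(wer("the fleet ribbons sailed past quickly",
          "the flee tribbins sailed past quickly"))
```

Because the denominator is the reference length, longer and shorter sentences are directly comparable without any extra normalisation.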
When scoring SUS results using WER, I assume we should still allow for homophones, as Benoit et al. suggested in their scoring scheme. Is this right?
The argument is that in real (semantically meaningful) speech we distinguish homophones through context, which is absent in the SUS stimuli.
Yes, you need to allow for homophones, and I also recommend allowing for spelling mistakes: you are testing your systems, not the listeners.
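One way to do this in practice is to fold homophones and trivial spelling variants into a canonical form before computing WER. The homophone table below is made up purely for illustration; a real one could be derived from a pronunciation dictionary so that words sharing a pronunciation fall into the same equivalence class:

```python
# Hypothetical homophone map for illustration only.
HOMOPHONES = {
    "right": "write",
    "their": "there",
    "they're": "there",
    "too": "to",
    "two": "to",
}

def normalise(text):
    """Lower-case, strip basic punctuation, and collapse homophones to a
    canonical form so listeners are not penalised for spelling choices."""
    words = text.lower().replace(",", "").replace(".", "").split()
    return " ".join(HOMOPHONES.get(w, w) for w in words)

# With wer() as in the sketch above: only homophone differences remain,
# so this scores 0.0 rather than penalising the listener.
print(wer(normalise("Write to their house."),
          normalise("right too there house")))
```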