Word transcription: homophones and typos
When we ask listeners to perform word transcription tasks, do we first convert their transcriptions into phonetic symbols and evaluate the accuracy of that phonetic transcription instead? I would assume that homophones are hard to transcribe with exactly the same spelling as the text, especially in SUS.
This is a good point, and one we do indeed have to confront in a practical test. In a SUS test, we compare words (not their pronunciations). It is therefore appropriate to allow listeners to type in homophones, or indeed to mis-spell words.
There is usually either some pre-processing of the typed-in responses, before we compute the Word Error Rate (WER), or we allow for these mismatches when performing the dynamic programming alignment as part of the WER computation. This might be achieved by creating lists of acceptable matches for each word in the correct transcription, such as
correct word: your
allowable responses: your, you’re, yore, youre
Such lists need updating for each listening test (after gathering the listeners’ responses) because listeners seem to be very good at finding new ways to mis-spell or mis-type words!
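To make the second option concrete, here is a minimal sketch of a WER computation in which the dynamic programming alignment consults a list of acceptable matches. It is only an illustration, not the Blizzard Challenge tooling: the names `ACCEPTABLE`, `normalise`, `matches` and `word_error_rate` are hypothetical, the variant table holds just the example entry above, and punctuation other than apostrophes is not handled.

```python
# Hypothetical variant table: reference word -> responses that count as correct.
# In practice this would be extended after inspecting the listeners' responses.
ACCEPTABLE = {
    "your": {"your", "you're", "yore", "youre"},
}


def normalise(text):
    """Lower-case, map curly apostrophes to straight ones, split on whitespace."""
    return text.lower().replace("\u2019", "'").split()


def matches(ref_word, hyp_word, variants=ACCEPTABLE):
    """True if the typed word is the reference word or a listed variant of it."""
    return hyp_word == ref_word or hyp_word in variants.get(ref_word, ())


def word_error_rate(reference, response, variants=ACCEPTABLE):
    """Word-level edit distance divided by the number of reference words."""
    ref = normalise(reference)
    hyp = normalise(response)
    if not ref:
        raise ValueError("reference transcription is empty")

    # d[i][j] = minimum number of substitutions, deletions and insertions
    # needed to align the first i reference words with the first j typed words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # all reference words deleted
    for j in range(len(hyp) + 1):
        d[0][j] = j  # all typed words inserted

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            # An acceptable variant costs nothing; any other word is a substitution.
            sub = 0 if matches(ref[i - 1], hyp[j - 1], variants) else 1
            d[i][j] = min(
                d[i - 1][j - 1] + sub,  # match or substitution
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
            )

    return d[len(ref)][len(hyp)] / len(ref)
```

With this sketch, a response that types "youre" where the reference has "your" is scored as correct, while a response with a different word in that position still counts as one substitution. The pre-processing alternative would instead rewrite each listed variant to its reference spelling before running an ordinary WER computation.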
I’ve attached an example list of acceptable variants for a set of Semantically Unpredictable Sentences, taken from the tools used to run the Blizzard Challenge.