NSWs / Unknown Words
Do unknown words count as non-standard words?
In general, no.
Anything that you might be able to find in a dictionary (if you imagine having a really huge dictionary) is a Standard Word (even if our particular dictionary doesn’t include it).
Another way to decide what counts as a Standard Word is to say that its pronunciation must be determined directly from its spelling, using the same method as for all other Standard Words, without any other processing first.
Anything that you would not expect to find in the dictionary, however large it was, is a Non-Standard Word (NSW). These need converting to Standard Word(s) before attempting dictionary lookup – and that process is called normalisation.
For example, I just made up the word “Simonification”. If that got into common usage (it’s possible!) then one day dictionary writers would include it in their dictionaries. So, that’s a Standard Word, even though no current dictionary yet includes it.
In contrast, no dictionary would ever attempt to include
so those are NSWs.
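To make the distinction above concrete, here is a minimal sketch in Python. It is not how any particular front end does it: the function names, the tiny lexicon, and the "letters only" shape test are all assumptions made up for illustration. The point is only that "unknown" (not in our dictionary) and "non-standard" (would never be in any dictionary) are separate questions.

```python
import re

# Hypothetical heuristic (an assumption, not a real front-end rule):
# a token made only of letters, optionally with internal apostrophes or
# hyphens, has the "shape" of something a dictionary could list.
_WORD_SHAPE = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def looks_like_standard_word(token):
    """Return True if the token could plausibly appear in a huge dictionary."""
    return _WORD_SHAPE.fullmatch(token) is not None


def classify(token, lexicon):
    """Separate 'not in our dictionary' from 'non-standard word'."""
    if not looks_like_standard_word(token):
        # Needs normalisation into Standard Word(s) before any lookup.
        return "NSW"
    if token.lower() in lexicon:
        return "standard (in dictionary)"
    # Still a Standard Word: pronunciation would come from its spelling.
    return "standard (unknown word)"


# Toy lexicon, purely illustrative.
toy_lexicon = {"dictionary", "word"}

for tok in ["dictionary", "Simonification", "e.g."]:
    print(tok, "->", classify(tok, toy_lexicon))
```

With this toy setup, "Simonification" comes out as an unknown but Standard Word, while "e.g." is flagged as an NSW because of the full stops. A real system would use a much richer test than token shape.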
I was wondering whether tokens like e.g. and i.e. are considered NSWs.
Yes. The tokens “e.g.” and “i.e.”, which are abbreviations for Latin expressions, should be read as letter sequences (LSEQ). Text normalisation can be expected to handle them correctly.
(Since these are very common, many dictionaries will include them as “words”.)
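As a rough illustration of what "read as a letter sequence" means, the sketch below expands a token into one item per letter. The function names and the small set of known LSEQ tokens are hypothetical; a real normaliser would decide the token's class with a classifier or hand-written rules, and would then look up each letter's name in the dictionary.

```python
from typing import List

# Hypothetical list of tokens we have decided are letter sequences (LSEQ).
# In practice this decision comes from the normaliser, not a fixed set.
LSEQ_TOKENS = {"e.g.", "i.e."}


def expand_lseq(token: str) -> List[str]:
    """Expand a letter-sequence token into one upper-case item per letter."""
    return [ch.upper() for ch in token if ch.isalpha()]


def normalise(token: str) -> List[str]:
    """Return the items to pass on to dictionary lookup."""
    if token.lower() in LSEQ_TOKENS:
        return expand_lseq(token)
    # Anything else is passed through unchanged in this sketch.
    return [token]


print(normalise("e.g."))  # ['E', 'G']
print(normalise("i.e."))  # ['I', 'E']
```

If the dictionary already lists "e.g." as an entry, this expansion step could simply be skipped for that token.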