The frequency of occurrence of each N-gram in a training corpus is used to estimate its probability.
In Dan Jurafsky and James H. Martin, “Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition”, Second edition, Pearson Prentice Hall, Upper Saddle River, N.J., 2009, ISBN 0135041961.
Alternative if you don’t have the 2nd edition of this book: read Chapter 3.0-3.1 in J&M 3rd edition, which is currently available online.
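
To make the idea of estimating an N-gram's probability from its frequency concrete, here is a minimal sketch for the bigram case using relative frequencies: the count of a word pair divided by the count of its first word. The three-sentence toy corpus, the `<s>` and `</s>` boundary markers, and the function names `train_bigram_counts` and `bigram_probability` are assumptions made for illustration; the reading itself may present the estimate differently, and this sketch applies no smoothing.

```python
from collections import Counter


def train_bigram_counts(sentences):
    """Count unigrams and bigrams, one sentence at a time.

    Each sentence is padded with <s> and </s> so that bigrams never
    span a sentence boundary (an assumption of this sketch).
    """
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_probability(unigrams, bigrams, prev, word):
    """Relative-frequency estimate: count(prev, word) / count(prev)."""
    if unigrams[prev] == 0:
        return 0.0  # unseen context; no smoothing in this sketch
    return bigrams[(prev, word)] / unigrams[prev]


# Toy training corpus (illustrative only).
corpus = ["I am Sam", "Sam I am", "I do not like green eggs and ham"]
unigrams, bigrams = train_bigram_counts(corpus)

print(bigram_probability(unigrams, bigrams, "I", "am"))
print(bigram_probability(unigrams, bigrams, "am", "Sam"))
```

In this corpus "I" occurs three times and is followed by "am" twice, so the first estimate is 2/3. Any bigram that never appears in the training data gets probability 0, which is the main practical limitation of raw frequency estimates.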