In Dan Jurafsky and James H. Martin, "Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition", 2nd ed., Pearson Prentice Hall, Upper Saddle River, NJ, 2009, ISBN 0135041961
Jurafsky & Martin - Section 4.1 - Word Counting in Corpora
The probability of each N-gram is estimated from how frequently it occurs in a training corpus.
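A minimal sketch of counting N-grams in a tokenized corpus, using the "I am Sam" toy corpus from the chapter (the `ngram_counts` helper is my own illustrative name, not from the book):

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple of tokens) in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Toy corpus in the spirit of the chapter's "I am Sam" example.
corpus = "I am Sam Sam I am I do not like green eggs and ham".split()
bigrams = ngram_counts(corpus, 2)
# The bigram ("I", "am") occurs twice in this corpus.
```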
Jurafsky & Martin - Section 4.2 - Simple (Unsmoothed) N-Grams
We can estimate probabilities directly from raw counts: the unsmoothed (maximum likelihood) estimate divides the count of an N-gram by the count of its context.
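For bigrams this is P(w | w_prev) = C(w_prev, w) / C(w_prev). A sketch of that estimate (function name is my own; note that with no smoothing, unseen bigrams get probability zero):

```python
from collections import Counter

def mle_bigram_prob(tokens, w_prev, w):
    """Unsmoothed MLE estimate: P(w | w_prev) = C(w_prev w) / C(w_prev)."""
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    unigram_counts = Counter(tokens)
    return bigram_counts[(w_prev, w)] / unigram_counts[w_prev]

tokens = "I am Sam Sam I am I do not like green eggs and ham".split()
# C("I am") = 2 and C("I") = 3, so the estimate is 2/3.
p = mle_bigram_prob(tokens, "I", "am")
```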
Jurafsky & Martin - Section 4.3 - Training and Test Sets
As in machine learning generally, it is essential to evaluate a model on data it was not trained on: the corpus is divided into a training set, used to collect the counts, and a held-out test set, used only for evaluation.
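A minimal sketch of such a split, holding out the final portion of the corpus as test data (the helper and the 80/20 ratio are illustrative assumptions, not prescribed by the book):

```python
def split_corpus(tokens, train_frac=0.8):
    """Split a token sequence into a training set and a held-out test set."""
    cut = int(len(tokens) * train_frac)
    return tokens[:cut], tokens[cut:]

train, test = split_corpus(list(range(10)))
```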
Jurafsky & Martin - Section 4.4 - Perplexity
It is possible to evaluate how good an N-gram model is without integrating it into an application such as an automatic speech recognition system. We simply measure how well it predicts unseen test data: perplexity is the inverse probability of the test set, normalized by the number of words, so a lower perplexity means a better model.
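Equivalently, perplexity is exp of the average negative log probability per word. A sketch of that computation, where `logprob` is a hypothetical callback standing in for any language model; as a sanity check, a uniform model over a vocabulary of size V has perplexity exactly V:

```python
import math

def perplexity(test_tokens, logprob):
    """Perplexity = exp(-(1/N) * sum_i log P(w_i | history))."""
    n = len(test_tokens)
    total = sum(logprob(test_tokens[:i], w) for i, w in enumerate(test_tokens))
    return math.exp(-total / n)

# Uniform model over a 50-word vocabulary: perplexity should be 50.
V = 50
pp = perplexity(["w"] * 100, lambda hist, w: math.log(1.0 / V))
```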