In Dan Jurafsky and James H. Martin, "Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition", 2nd ed., Pearson Prentice Hall, Upper Saddle River, NJ, 2009, ISBN 0135041961
Jurafsky & Martin - Section 4.1 - Word Counting in Corpora
The probability of each N-gram is estimated from how frequently it occurs in a training corpus.
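A minimal sketch of counting N-grams in a tokenized corpus, using the "I am Sam" toy corpus from the chapter (the `ngram_counts` helper is my own illustrative name, not from the book):

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple of tokens) in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Toy corpus in the spirit of the chapter's "I am Sam" example.
corpus = "I am Sam Sam I am I do not like green eggs and ham".split()
bigrams = ngram_counts(corpus, 2)
# The bigram ("I", "am") occurs twice in this corpus.
```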
Jurafsky & Martin - Section 4.2 - Simple (Unsmoothed) N-Grams
We can estimate probabilities directly from raw counts: the unsmoothed (maximum likelihood) estimate divides the count of an N-gram by the count of its context.
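For bigrams this is P(w | w_prev) = C(w_prev, w) / C(w_prev). A sketch of that estimate (function name is my own; note that with no smoothing, unseen bigrams get probability zero):

```python
from collections import Counter

def mle_bigram_prob(tokens, w_prev, w):
    """Unsmoothed MLE estimate: P(w | w_prev) = C(w_prev w) / C(w_prev)."""
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    unigram_counts = Counter(tokens)
    return bigram_counts[(w_prev, w)] / unigram_counts[w_prev]

tokens = "I am Sam Sam I am I do not like green eggs and ham".split()
# C("I am") = 2 and C("I") = 3, so the estimate is 2/3.
p = mle_bigram_prob(tokens, "I", "am")
```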
Jurafsky & Martin - Section 4.3 - Training and Test Sets
As in machine learning generally, it is essential to evaluate a model on data it was not trained on: the corpus is divided into a training set, used to collect the counts, and a held-out test set, used only for evaluation.
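A minimal sketch of such a split, holding out the final portion of the corpus as test data (the helper and the 80/20 ratio are illustrative assumptions, not prescribed by the book):

```python
def split_corpus(tokens, train_frac=0.8):
    """Split a token sequence into a training set and a held-out test set."""
    cut = int(len(tokens) * train_frac)
    return tokens[:cut], tokens[cut:]

train, test = split_corpus(list(range(10)))
```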
Jurafsky & Martin - Section 4.4 - Perplexity
It is possible to evaluate how good an N-gram model is without integrating it into an application such as an automatic speech recognition system. We simply measure how well it predicts unseen test data: perplexity is the inverse probability of the test set, normalized by the number of words, so a lower perplexity means a better model.
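Equivalently, perplexity is exp of the average negative log probability per word. A sketch of that computation, where `logprob` is a hypothetical callback standing in for any language model; as a sanity check, a uniform model over a vocabulary of size V has perplexity exactly V:

```python
import math

def perplexity(test_tokens, logprob):
    """Perplexity = exp(-(1/N) * sum_i log P(w_i | history))."""
    n = len(test_tokens)
    total = sum(logprob(test_tokens[:i], w) for i, w in enumerate(test_tokens))
    return math.exp(-total / n)

# Uniform model over a 50-word vocabulary: perplexity should be 50.
V = 50
pp = perplexity(["w"] * 100, lambda hist, w: math.log(1.0 / V))
```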