Jurafsky & Martin – Section 4.1 – Word Counting in Corpora

The frequency of occurrence of each N-gram in a training corpus is used to estimate its probability.
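
As a rough illustration of that idea, the sketch below counts bigrams in a toy corpus and turns the counts into relative-frequency (maximum likelihood) estimates. The corpus, the function names `ngrams` and `bigram_probability`, and the choice of bigrams are assumptions made for this example; they are not taken from the book, and no smoothing is applied.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams in a token list, each as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bigram_probability(tokens, prev, word):
    """Estimate P(word | prev) as C(prev word) / C(prev followed by anything)."""
    bigram_counts = Counter(ngrams(tokens, 2))
    # Count how often `prev` appears as the first word of any bigram.
    context_count = sum(
        count for (first, _), count in bigram_counts.items() if first == prev
    )
    if context_count == 0:
        # Unseen context: no estimate is available without smoothing.
        return 0.0
    return bigram_counts[(prev, word)] / context_count


# Hypothetical toy corpus, tokenized on whitespace.
corpus = "the cat sat on the mat the cat ran".split()

p_cat_given_the = bigram_probability(corpus, "the", "cat")
p_mat_given_the = bigram_probability(corpus, "the", "mat")
```

Here "the" is followed by "cat" twice and by "mat" once, so the estimates come out as 2/3 and 1/3 respectively. Any bigram that never occurs in the training corpus gets probability zero under this estimate.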

In Dan Jurafsky and James H. Martin, “Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition”, Second edition, 2009, Pearson Prentice Hall, Upper Saddle River, N.J., ISBN 0135041961.

Alternative if you don’t have the 2nd edition of this book: you can read Sections 3.0-3.1 of J&M 3rd edition, which is currently available online.

Forum for discussing this reading

- November 1, 2016 at 11:53 #5758
  Simon King
  Professor
  N-Grams