## Foundations of Statistical Natural Language Processing

Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.

### Contents

Introduction

Mathematical Foundations

Linguistic Essentials

n-gram Models over Sparse Data

Word Sense Disambiguation

Lexical Acquisition

Markov Models

Part-of-Speech Tagging

Probabilistic Context Free Grammars

Probabilistic Parsing

Statistical Alignment and Machine Translation

### Common terms and phrases

adjectives, adverbs, algorithm, alignment, ambiguous word, applied, approach, arg max, assume, basic, Bayes, bigrams, binomial distribution, Brown corpus, calculate, chapter, clustering, collocations, compute, context, corpora, counts, cross entropy, depends, derivation, dictionary, discussed, documents, EM algorithm, English, evaluation, example, Exercise, figure, frequency, function, Good-Turing, grammar, language model, lexical acquisition, linguistics, look, Markov Model, matrix, meaning, measure, methods, mutual information, n-gram, n-gram models, node, normally, noun phrase, object, occur, parameters, parser, parsing, part-of-speech tagging, particular, PCFG, Penn Treebank, phrase structure, predict, probability distribution, probability estimates, problem, pronoun, random variable, refer, sample, score, selectional preferences, semantic similarity, sentence, sequence, shows, space, speech, Statistical NLP, subcategorization frames, syntactic, tag set, tagger, theory, training corpus, training data, transformation-based, translation, tree, Treebank, trigram, values, variance, vectors, verb, Viterbi algorithm, word sense disambiguation, Zipf's law