Foundations of Statistical Natural Language Processing
Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.
Contents
Introduction
Mathematical Foundations
Linguistic Essentials
Corpus-Based Work
Lexical Acquisition
Markov Models
Part-of-Speech Tagging
Probabilistic Context Free Grammars
Probabilistic Parsing
Statistical Alignment and Machine Translation
Clustering
Other editions
Foundations of Statistical Natural Language Processing, Christopher Manning and Hinrich Schütze, 1999
Common terms and phrases
adjectives, adverbs, algorithm, alignment, ambiguous word, applied, approach, arg max, assume, basic, Bayes, bigrams, binomial distribution, Brown corpus, calculate, chapter, classification, clustering, collocations, compute, context, corpora, cross entropy, defined, depends, derivation, dictionary, discussed, documents, EM algorithm, English, evaluation, example, Exercise, figure, frequency, function, Good-Turing, grammar, language model, lexical acquisition, linguistics, look, Markov Model, matrix, maximum likelihood, meaning, measure, methods, mutual information, n-gram, n-gram models, node, normally, noun phrase, object, occur, parameters, parser, parsing, PCFG, Penn Treebank, predict, probabilistic, probability distribution, probability estimates, problem, pronoun, random variable, refer, sample, score, selectional preferences, semantic similarity, sentence, sequence, SGML, shows, space, speech, Statistical NLP, subcategorization frames, syntactic, tag set, tagger, test set, theory, training data, translation, tree, Treebank, trigram, values, vector, verb, Viterbi algorithm, word sense disambiguation, Zipf's law