Trigram tagger

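In computational linguistics, a trigram tagger is a statistical part-of-speech tagger: a method for automatically labelling words as nouns, verbs, adjectives, adverbs, and so on. It is based on a second-order Markov model, in which the tag assigned to a word depends on the tags of the two preceding words, so the model considers triples of consecutive tags. It is trained on a tagged text corpus and predicts the most probable tag sequence for a sentence. In the formulation of Brants (2000), the probability of the next tag is a weighted combination of unigram, bigram and trigram estimates, which is multiplied by the probability of each word given its tag. In speech recognition, algorithms that use a trigram tagger score better than those that use an HMM tagger but less well than those that use the Net tagger.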

Brants (2000) describes a trigram tagger in detail.

References

Kempe, A. (1993). "A Stochastic Tagger and an Analysis of Tagging Errors". Internal paper. Institute for Computational Linguistics, Universität Stuttgart.
Brants, T. (2000). "TnT - A Statistical Part-of-Speech Tagger". Proceedings of the 6th Applied Natural Language Processing Conference, ANLP-2000.

External links

TnT -- Statistical Part-of-Speech Tagging by Thorsten Brants
