camel_tools.tagger.default

This module contains the CAMeL Tools default tagger.

Classes

class camel_tools.tagger.default.DefaultTaggerError

Base class for errors raised by DefaultTagger.

class camel_tools.tagger.default.InvalidDefaultTaggerDisambiguator

Error raised when a DefaultTagger is initialized with an object that does not implement Disambiguator.

class camel_tools.tagger.default.InvalidDefaultTaggerFeature(feature)

Error raised when a DefaultTagger is initialized with an invalid feature name.

class camel_tools.tagger.default.DefaultTagger(disambiguator, feature)

The default camel_tools tagger. It generates tags for a given feature by first disambiguating a word using a given disambiguator and then returning the associated value for that feature. It also provides sensible default values for when no analyses are generated by the disambiguator or when a feature is not present in the disambiguation.

Parameters:
  • disambiguator (Disambiguator) – The disambiguator used for disambiguating input.
  • feature (str) – The feature to be produced.
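Raises:
  • InvalidDefaultTaggerDisambiguator – If disambiguator does not implement Disambiguator.
  • InvalidDefaultTaggerFeature – If feature is not a valid feature name.
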
static feature_list()

Returns the list of valid features producible by DefaultTagger.

tag(sentence)

Generate a tag for each token in a given sentence.

Parameters: sentence (list of str) – The sentence to be tagged.
Returns: The list of tags corresponding to each token in sentence.
Return type: list
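
The tagging flow described above (disambiguate each word, read the requested feature, fall back to a default when nothing is available) can be sketched in plain Python. This is only an illustration and not the library's implementation: ToyDisambiguator, top_analysis, tag_sentence, and the 'na' fallback value are all hypothetical names and values chosen for the sketch, and the real DefaultTagger defines its own per-feature defaults.

```python
class ToyDisambiguator:
    """Hypothetical stand-in for a Disambiguator (not the camel_tools class)."""

    def __init__(self, lexicon):
        # Maps a word to a dict of feature values for its best analysis.
        self._lexicon = lexicon

    def top_analysis(self, word):
        # Return the feature dict for the word, or None if it has no analysis.
        return self._lexicon.get(word)


def tag_sentence(disambiguator, feature, sentence, default='na'):
    """Return one tag per token, using `default` when no value is available."""
    tags = []
    for word in sentence:
        analysis = disambiguator.top_analysis(word)
        if analysis is None:
            # The disambiguator produced no analysis for this word.
            tags.append(default)
        else:
            # The analysis exists but may lack the requested feature.
            tags.append(analysis.get(feature, default))
    return tags


toy = ToyDisambiguator({
    'cats': {'pos': 'noun', 'num': 'p'},
    'sleep': {'pos': 'verb'},
})
tags = tag_sentence(toy, 'num', ['cats', 'sleep', 'zzz'])
print(tags)  # ['p', 'na', 'na']
```

Both fallback cases appear here: 'sleep' has an analysis without a 'num' value, and 'zzz' has no analysis at all.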

Features

The features that can be produced by DefaultTagger are: 'diac', 'bw', 'asp', 'cas', 'gen', 'mod', 'num', 'per', 'pos', 'enc0', 'enc1', 'enc2', 'prc0', 'prc1', 'prc2', 'prc3', 'form_num', 'form_gen', 'stt', 'vox', 'atbtok', 'atbseg', 'bwtok', 'd1tok', 'd1seg', 'd2tok', 'd2seg', 'd3tok', 'd3seg', 'catib6', 'ud', 'caphi'.

See CAMeL Morphology Features for more information on features and their values.
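
Because an invalid feature name is rejected at construction time, it can be convenient to check names up front, for example when they come from a configuration file. The sketch below hard-codes the feature names listed above so that it runs without camel_tools installed. VALID_FEATURES and is_valid_feature are hypothetical helpers, not part of the library; in real code the list would more naturally come from DefaultTagger.feature_list().

```python
# Feature names copied from the list above (assumed to match feature_list()).
VALID_FEATURES = (
    'diac', 'bw', 'asp', 'cas', 'gen', 'mod', 'num', 'per', 'pos',
    'enc0', 'enc1', 'enc2', 'prc0', 'prc1', 'prc2', 'prc3',
    'form_num', 'form_gen', 'stt', 'vox',
    'atbtok', 'atbseg', 'bwtok', 'd1tok', 'd1seg', 'd2tok', 'd2seg',
    'd3tok', 'd3seg', 'catib6', 'ud', 'caphi',
)


def is_valid_feature(name):
    """Return True if `name` is one of the listed feature names."""
    # The comparison is case-sensitive: 'POS' is not the same as 'pos'.
    return name in VALID_FEATURES


requested = ['pos', 'lemma', 'POS', 'd3tok']
accepted = [name for name in requested if is_valid_feature(name)]
rejected = [name for name in requested if not is_valid_feature(name)]
print(accepted)  # ['pos', 'd3tok']
print(rejected)  # ['lemma', 'POS']
```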

Examples

from camel_tools.disambig.mle import MLEDisambiguator
from camel_tools.tagger.default import DefaultTagger

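# Load the pretrained MLE disambiguator and build a part-of-speech tagger.
mled = MLEDisambiguator.pretrained()
tagger = DefaultTagger(mled, 'pos')

# Tag a whitespace-tokenized sentence; returns one tag per token.
tagger.tag('ذهبت الى المدرسة'.split())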