02 Segmentation
Segmentation is the process of dividing text into individual sentences. For machine annotation of corpora in the Slovenian context, we currently use the CLASSLA-Stanza tagger, more precisely the Obeliks segmenter included in it. The rules that guide this automatic segmentation are also followed during manual revision.
Introduction to Segmentation
This chapter summarizes the annotation guidelines for sentence segmentation. The main guideline f...
Annotation Guidelines
This chapter summarizes the annotation guidelines for segmentation as applied to Slovene texts. V...
References and Links
This chapter compiles relevant references and provides links to projects where segmentation has b...