04 MULTEXT-East Morphosyntax

The MULTEXT-East framework for morphosyntactic annotation of text corpora defines compact character codes called MSD-tags ('MSD' stands for morphosyntactic description). Each tag is positional: the first character gives the part of speech, and each following character gives the value of one attribute. For example, the tag "Ncmsn" stands for the feature set "Noun Type=common Gender=masculine Number=singular Case=nominative". The annotation system has been defined for 20 languages or dialects, including all Slavic languages.
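The mapping from a tag to its features can be sketched in a few lines of Python. This is only an illustration: the attribute table below is a small hand-written subset covering nouns, the names `NOUN_ATTRIBUTES`, `CATEGORIES` and `decode_msd` are hypothetical, and the treatment of "-" as a skipped attribute is an assumption of this sketch. The full tables are in the MULTEXT-East specifications.

```python
# Illustrative subset of a positional MSD table (nouns only).
# Assumption: this table is hand-written for the example and is not the full specification.
NOUN_ATTRIBUTES = [
    ("Type", {"c": "common", "p": "proper"}),
    ("Gender", {"m": "masculine", "f": "feminine", "n": "neuter"}),
    ("Number", {"s": "singular", "d": "dual", "p": "plural"}),
    ("Case", {
        "n": "nominative", "g": "genitive", "d": "dative",
        "a": "accusative", "l": "locative", "i": "instrumental",
    }),
]

# First character of the tag -> (category name, ordered attribute table).
CATEGORIES = {"N": ("Noun", NOUN_ATTRIBUTES)}


def decode_msd(tag: str) -> str:
    """Expand a positional MSD tag into a readable feature string."""
    if not tag or tag[0] not in CATEGORIES:
        raise ValueError(f"unsupported MSD tag: {tag!r}")
    category, attributes = CATEGORIES[tag[0]]
    features = [category]
    # Pair each remaining character with the attribute at the same position.
    # Positions beyond this small table are ignored by zip().
    for code, (attribute, values) in zip(tag[1:], attributes):
        if code == "-":
            # Assumption: "-" means the attribute does not apply, so skip it.
            continue
        if code not in values:
            raise ValueError(f"unknown value {code!r} for {attribute}")
        features.append(f"{attribute}={values[code]}")
    return " ".join(features)


print(decode_msd("Ncmsn"))
# Noun Type=common Gender=masculine Number=singular Case=nominative
```

Keeping the attributes in an ordered list mirrors the positional nature of the tags: the meaning of a character depends on where it stands, not only on which letter it is.
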
MULTEXT-East tags have been used for Slovene since 1996 and appear in all subsequent open corpora of Slovene, whether manually or automatically annotated. The Universal Dependencies morphosyntactic annotation framework is now gradually taking over the role that MULTEXT-East played for decades.