12 Slovene learner corpus KOST

The KOST annotation system was developed together with the KOST corpus of Slovene as a foreign language (Stritar Kučuk 2022) and is designed for categorizing teachers' corrections in texts written by speakers of Slovene as a second or foreign language. The tagging system is hierarchically organized in two tiers: the first tier defines the linguistic level of the correction, and the second characterizes its general type or the part of speech involved. The two-tier annotation allows for a robust analysis, which has to be followed by a more detailed manual revision.
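To make the two-tier organization concrete, the following minimal Python sketch shows one way such tags could be represented and counted per tier. It is an illustration only: the `LEVEL/CATEGORY` string format, the `/` separator, the placeholder labels (`LEVEL_A`, `TYPE_1`, and so on), and the helper names `Tag`, `parse_tag`, and `count_by_tier` are all assumptions made for this example and are not taken from the KOST tagset or its tooling.

```python
from collections import Counter
from typing import NamedTuple


class Tag(NamedTuple):
    """A hypothetical two-tier tag (not the actual KOST format)."""
    level: str     # tier 1: linguistic level of the correction
    category: str  # tier 2: general correction type or part of speech


def parse_tag(raw: str, sep: str = "/") -> Tag:
    """Split an assumed 'LEVEL/CATEGORY' string into its two tiers."""
    level, _, category = raw.partition(sep)
    if not level or not category:
        raise ValueError(f"expected 'LEVEL{sep}CATEGORY', got {raw!r}")
    return Tag(level, category)


def count_by_tier(raw_tags):
    """Count corrections per first-tier level and per full two-tier tag."""
    tags = [parse_tag(raw) for raw in raw_tags]
    by_level = Counter(tag.level for tag in tags)
    by_pair = Counter(tags)
    return by_level, by_pair


# Placeholder labels only; real tag names would come from the KOST tagset.
sample = ["LEVEL_A/TYPE_1", "LEVEL_A/TYPE_2", "LEVEL_B/TYPE_1", "LEVEL_A/TYPE_1"]
by_level, by_pair = count_by_tier(sample)
```

Counting at the first tier gives the coarse overview, while the full two-tier counts point to the groups of corrections that would then be revised manually in more detail.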