11 Developmental corpus Šolar

The Šolar annotation system, developed alongside the Slovene developmental corpus Šolar (Arhar Holdt et al. 2022), categorizes language corrections in texts written by pupils in Slovene primary schools and students in Slovene secondary schools. The initial framework for annotating corrections was established in the first edition of the corpus (Kosem et al. 2012) and significantly enhanced in version 2.0 (Kosem et al. 2016), which also saw the initial development of the annotation guidelines (Arhar Holdt et al. 2018). The system is structured hierarchically into three levels: a tag first identifies the linguistic level of the correction, then classifies the general type of correction, and finally pinpoints the specific linguistic issue. This three-tier design allows the same annotation to be analyzed coarsely, by linguistic level alone, or in fine detail, down to the specific issue.
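To make the three-level structure concrete, here is a minimal sketch of how such a tag might be represented and queried at different granularities. The class name `CorrectionTag`, the `/` separator, and the example labels are all illustrative assumptions and are not taken from the Šolar guidelines or the corpus encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorrectionTag:
    """A hypothetical three-level correction tag (not the official Šolar format)."""

    level: str  # level 1: linguistic level of the correction
    type: str   # level 2: general type of correction
    issue: str  # level 3: specific linguistic issue

    @classmethod
    def parse(cls, label: str, sep: str = "/") -> "CorrectionTag":
        # The separator is an assumption made for this sketch.
        parts = label.split(sep)
        if len(parts) != 3:
            raise ValueError(f"expected 3 levels, got {len(parts)}: {label!r}")
        return cls(*parts)

    def prefixes(self, sep: str = "/") -> list[str]:
        # Return the tag at each granularity, from coarsest to finest.
        return [
            self.level,
            sep.join([self.level, self.type]),
            sep.join([self.level, self.type, self.issue]),
        ]


# Invented placeholder label, for illustration only.
tag = CorrectionTag.parse("SPELLING/CAPITALIZATION/proper-noun")
coarse_to_fine = tag.prefixes()
```

Grouping corrections by the first element of `prefixes()` would give counts per linguistic level, while the last element keeps the full distinction down to the specific issue.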