# References and Links

This chapter compiles relevant references and provides links to projects where the lemmatization of Slovene has been developed and applied to Slovene texts.<br />

**Projects, in which the system has been developed:<br />**
[JOS - Linguistic Annotation of Slovene: Methods and Resources](http://nl.ijs.si/jos/index-en.html)<br />
[Communication in Slovene](http://eng.slovenscina.eu/)<br />
[Janes - Resources, Tools and Methods for the Research of Nonstandard Internet Slovene](https://nl.ijs.si/janes/english/)<br />
[Development of Slovene in a Digital Environment](https://slovenscina.eu/en)<br />

**Training corpora containing manually revised lemmas:<br />**
Arhar Holdt, Špela; Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž; Gantar, Polona; Čibej, Jaka; Pori, Eva; Terčon, Luka; Munda, Tina; Žitnik, Slavko; Robida, Nejc; Blagus, Neli; Može, Sara; Ledinek, Nina; Holz, Nanika; Zupan, Katja; Kuzman, Taja; Kavčič, Teja; Škrjanec, Iza; Marko, Dafne; Jezeršek, Lucija; Zajc, Anja, 2022, Training corpus SUK 1.0, Slovenian language resource repository CLARIN.SI, ISSN 2820-4042, [http://hdl.handle.net/11356/1747](http://hdl.handle.net/11356/1747).<br />

Lenardič, Jakob; Čibej, Jaka; Arhar Holdt, Špela; Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola; Zupan, Katja; Dobrovoljc, Kaja, 2022, CMC training corpus Janes-Tag 3.0, Slovenian language resource repository CLARIN.SI, ISSN 2820-4042, [http://hdl.handle.net/11356/1732](http://hdl.handle.net/11356/1732).

**References:<br />**
Erjavec, Tomaž; Fišer, Darja; Krek, Simon in Ledinek, Nina. 2010. The JOS Linguistically Tagged Corpus of Slovene. V: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC’10), Valeta, Malta, Maj. European Language Resources Association (ELRA). [http://www.lrec-conf.org/proceedings/lrec2010/pdf/139_Paper.pdf](http://www.lrec-conf.org/proceedings/lrec2010/pdf/139_Paper.pdf) <span style="color:white">tag</span> [[PDF]](https://wiki.cjvt.si/attachments/46)<br />

Erjavec, Tomaž. 2012. MULTEXT-East: morphosyntactic resources for Central and Eastern European languages. Language Resources and Evaluation, 46(1): 131–142. DOI: [10.1007/s10579-011-9174-8](https://doi.org/10.1007/s10579-011-9174-8) <span style="color:white">tag</span> [[PDF]](https://wiki.cjvt.si/attachments/47)<br />

Krek, Simon; Erjavec, Tomaž; Dobrovoljc, Kaja; Gantar, Polona; Arhar Holdt, Špela; Čibej, Jaka in Brank, Janez. The ssj500k training corpus for Slovene language processing. V: Fišer, D. in Erjavec, T. Jezikovne tehnologije in digitalna humanistika: zbornik konference: 24.-25. september 2020, Ljubljana, Slovenija. Ljubljana: Inštitut za novejšo zgodovino, 2020. Str. 24–33.	 [http://nl.ijs.si/jtdh20/pdf/JT-DH_2020_Krek-et-al_The-ssj500k-Training-Corpus-for-Slovene-Language-Processing.pdf](http://nl.ijs.si/jtdh20/pdf/JT-DH_2020_Krek-et-al_The-ssj500k-Training-Corpus-for-Slovene-Language-Processing.pdf) <span style="color:white">tag</span> [[PDF]](https://wiki.cjvt.si/attachments/48)<br />