Search Results
133 total results found
Annotation Guidelines
This chapter collects the KOST annotation guidelines. The guidelines are arranged from the oldest version to the latest, up-to-date version. Version 1.0 (04-2022) Project Development of Slovene in a Digital Environment STRITAR KUČUK, Mojca, 2023: KOST 1.0: Priročnik za označevanj...
References and Links
This chapter collects relevant references and links to projects in which the annotation system was developed and used. Projects in which the annotation system was developed Development of Slovene in a Digital Environment Current version of the KOST corpus STR...
Introduction to Segmentation
This chapter briefly presents sentence segmentation. The main guideline for delimiting sentences is the combination of a sentence-final punctuation mark, a space, and a word written with an initial capital letter. This is supplemented by additional rules that cover abbreviations. These are writ...
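The boundary rule described in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the project's segmenter: the excerpt is cut off before the abbreviation rules are stated, so the `ABBREVIATIONS` set, the function name `split_sentences`, and the decision to simply skip a boundary after a listed abbreviation are all assumptions made for the example.

```python
import re

# Hypothetical abbreviation list (assumption); the actual guidelines define their own rules.
ABBREVIATIONS = {"dr.", "npr.", "oz."}

# Candidate boundary: final punctuation, then whitespace, then a capitalized word.
BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-ZČŠŽ])")


def split_sentences(text):
    """Split text at candidate boundaries, skipping those that follow an abbreviation."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        candidate = text[start:match.start()]
        words = candidate.split()
        last_word = words[-1].lower() if words else ""
        if last_word in ABBREVIATIONS:
            # Assumed rule: a period that ends an abbreviation does not end the sentence.
            continue
        sentences.append(candidate)
        start = match.end()
    tail = text[start:]
    if tail:
        sentences.append(tail)
    return sentences


print(split_sentences("To je dr. Novak. Prišel je včeraj."))
```

A sentence that genuinely ends in a listed abbreviation would be merged with the next one under this simplification, which is one reason the full guidelines need additional rules.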
References and Links
This chapter collects relevant references and links to projects in which the segmentation procedure was developed and used. Projects in which the annotation system was developed or used JOS - Linguistic Annotation of Slovene: Methods...
Annotation Guidelines
This chapter summarizes the annotation guidelines for tokenization. ⬥ Space is the principal separator for tokens. ⬥ Sequences of words that can be written either with or without a space without changing their meaning (e.g. kdorkoli, kdor koli “anybody, any body”) f...
Introduction to Normalization
This chapter summarizes the process of normalizing non-standard Slovene words. A more detailed presentation can be found in the guidelines in the Annotation Guidelines chapter. In the case of Slovene tweets, normalization was carried out simultaneously with to...
Annotation Guidelines
This chapter summarizes the annotation guidelines for normalization of Slovene non-standard texts. The guidelines are arranged from the latest, up-to-date version to the oldest version. Version 2.0 Project Development of Slovene in a Digital Environment LENARD...
References and Links
This chapter compiles relevant references and provides links to projects where the normalization process has been developed and applied to Slovene texts. Projects in which normalization has been developed or applied Development of Slovene in a Digital Environ...
Introduction to Tags
In this chapter, we outline the design of the MULTEXT-East specifications. The multilingual MULTEXT-East specifications are written in XML, following the TEI recommendations, and define the morphosyntactic features (attributes and their values) of words, i.e. ...
Annotation Guidelines
This chapter summarizes the annotation guidelines for the MULTEXT-East morphosyntax as applied to Slovene texts. The guidelines are arranged from the latest, up-to-date version to the oldest version. Version 2.0 (25-02-2023) Project Development of Slovene in a...
References and Links
This chapter compiles relevant references and provides links to projects where the MULTEXT-East morphosyntax has been developed and applied to Slovene texts. Projects in which the system has been developed or applied MULTEXT-East - Multilingual corpora and te...
Annotation Guidelines
This chapter summarizes the annotation guidelines for the lemmatization of Slovene texts. The guidelines are arranged from the latest, up-to-date version to the oldest version. Version 2.0 (25-02-2023) Project Development of Slovene in a Digital Environment HO...
References and Links
This chapter compiles relevant references and provides links to projects where the lemmatization of Slovene has been developed and applied. Projects in which the system has been developed: JOS - Linguistic Annotation of Slovene: Methods and R...
Introduction to Tags
This chapter summarises the JOS-SYN syntax tags. A more detailed presentation can be found in the guidelines in the Annotation Guidelines chapter. Tag Description Atr (Attribute) Atr is used to link heads and their dependents in word phrases. The source...
Annotation Guidelines
This chapter summarizes the annotation guidelines for the JOS-SYN syntax as applied to Slovene texts. The guidelines are arranged from the latest, up-to-date version to the oldest version. Version 2.0 (02-2023) Project Development of Slovene in a Digital Envir...
References and Links
This chapter compiles relevant references and provides links to projects where the JOS-SYN syntax has been developed and applied to Slovene texts. Projects in which the system has been developed or applied JOS - Linguistic Annotation of Slovene: Methods and R...
Introduction to Tags
The Universal Dependencies framework establishes a comprehensive and universal set of tags for parts of speech (POS), morphological features and syntactic dependencies that can be adopted in the treebanks of individual languages, or supplemented with new morph...
Annotation Guidelines
This chapter summarizes the annotation guidelines for the Universal Dependencies (UD) morphology and syntax as applied to Slovene texts. The guidelines are arranged from the latest, up-to-date version to the oldest version. Version 1.7 Project SPOT DOBROVOLJC...
References and Links
This chapter compiles relevant references and provides links to projects where the Universal Dependencies (UD) morphology and syntax have been developed and applied to Slovene texts. Main website of the Universal Dependencies project: https://universaldepe...
Introduction to Labels
This chapter summarises labels for named entities (NEs). A more detailed presentation can be found in the guidelines in the Annotation Guidelines chapter. Category Subcategory Examples Doesn't belong in the category PER Person (name an...