03 Normalization

The language of computer-mediated communication (CMC) diverges significantly from the standard language, which poses challenges for current automatic text annotation tools. Normalization addresses this by providing a standard equivalent for each non-standard occurrence, which supports all further text processing. This step is critical because both lemmatization and morphosyntactic annotation of CMC language rely on the normalized forms (Čibej et al. 2016).
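To illustrate the idea of pairing each non-standard occurrence with a standard equivalent, here is a minimal sketch of token-level normalization by lexicon lookup. It is not the method used for the corpus described here: the lexicon entries, the English example tokens, and the names `NORMALIZATION_LEXICON` and `normalize_tokens` are all hypothetical and chosen only for illustration.

```python
# Hypothetical lookup table: non-standard form -> standard equivalent.
# Real normalization is usually done manually or with a trained model;
# a fixed dictionary is only the simplest possible stand-in.
NORMALIZATION_LEXICON = {
    "u": "you",
    "gr8": "great",
    "thx": "thanks",
}


def normalize_tokens(tokens, lexicon=NORMALIZATION_LEXICON):
    """Pair each original token with its normalized form.

    Tokens missing from the lexicon are assumed to be standard already
    and are returned unchanged.
    """
    return [(token, lexicon.get(token.lower(), token)) for token in tokens]


# Keep the original token next to its normalized form, so that later steps
# (lemmatization, morphosyntactic tagging) can run on the normalized form.
pairs = normalize_tokens("thx u are gr8".split())
for original, normalized in pairs:
    print(f"{original}\t{normalized}")
```

Keeping the original token alongside its normalized form means the non-standard spelling is never lost, while downstream tools only see the standard one.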