Error annotation

Error tags mark the linguistic errors made by learners writing in Slovene; 24% of the texts in the KOST corpus carry them. The error annotation is done manually: each error is assigned a corrected form and one of 23 error categories.

Error taxonomy

Each basic error category is listed below with its tag and an illustrative example from KOST (in Slovene), given as learner form > corrected form.

Orthographical errors

Punctuation (Z-LOC): v zimskem času, večina ljudi uporablja avtomobile > v zimskem času večina ljudi uporablja avtomobile

Spelling (Z-CRK): bolše > boljše

Joined or divided words (Z-SN): ni sem > nisem

Capital letters (Z-MV): najpomembnejši praznik v moji državi je Božič > najpomembnejši praznik v moji državi je božič

Abbreviations (Z-KR): in dr. > idr.

Lexical errors

Noun (B-SAM): ni mi všeč kadiranje > ni mi všeč kajenje

Verb (B-GLAG): sem se zelo težko naučila na mir > sem se zelo težko navadila na mir

Adjective (B-PRID): sem družbena oseba > sem družabna oseba

Pronoun (B-ZAIM): onidva > onadva

Adverb (B-PRISL): ko pride doma > ko pride domov

Preposition (B-PRED): sa prijateljico > s prijateljico

Conjunction (B-VEZ): kdaj sem obiskal Turčijo > ko sem obiskal Turčijo

Other (B-OST): petindvajest > petindvajset

Morphological errors

Noun (O-SAM): nagajajo pticami > nagajajo pticam

Verb (O-GLAG): iškem > iščem

Adjective (O-PRID): najglavnejša > najbolj glavna

Pronoun (O-ZAIM): v nama > v nas

Adverb (O-PRISL): pomaga boljši poznati materni jezik > pomaga bolje poznati materni jezik

Other (O-OST): štirje predavanja > štiri predavanja

Syntactical errors

Structure (S-STR): rada bi da živim sama > rada bi živela sama

Word order (S-BR): zdi mi se > zdi se mi

Omission (S-IZP): ki sem jedla > ki sem ga jedla

Insertion (S-ODV): ne bova bila > ne bova

Related corrections

These are corrections that become necessary only because something else in the context has been corrected: z mojim fantom > s svojim fantom
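
The tag codes in the taxonomy above follow a regular pattern: a one-letter prefix for the error level (Z, B, O, S) followed by a suffix for the category. A minimal sketch of how these codes could be grouped and validated when analysing the annotations is given below. The names LEVELS, TAGS, describe and parse_example are hypothetical helpers written for illustration; they are not part of KOST or Svala, and the sketch assumes nothing about the format in which the corpus is actually distributed.

```python
# Illustrative only: the tag codes are copied from the taxonomy above;
# the data structures and function names are assumptions, not a KOST API.

# Error level indicated by the tag prefix.
LEVELS = {
    "Z": "orthographical",
    "B": "lexical",
    "O": "morphological",
    "S": "syntactical",
}

# Category suffixes listed under each level.
TAGS = {
    "Z": ["LOC", "CRK", "SN", "MV", "KR"],
    "B": ["SAM", "GLAG", "PRID", "ZAIM", "PRISL", "PRED", "VEZ", "OST"],
    "O": ["SAM", "GLAG", "PRID", "ZAIM", "PRISL", "OST"],
    "S": ["STR", "BR", "IZP", "ODV"],
}


def describe(tag: str) -> tuple[str, str]:
    """Split a tag such as 'B-GLAG' into (error level, category suffix)."""
    prefix, _, suffix = tag.partition("-")
    if suffix not in TAGS.get(prefix, []):
        raise ValueError(f"unknown tag: {tag}")
    return LEVELS[prefix], suffix


def parse_example(example: str) -> tuple[str, str]:
    """Split 'learner form > corrected form' into its two parts."""
    learner, sep, corrected = example.partition(">")
    if not sep:
        raise ValueError(f"no '>' separator in: {example}")
    return learner.strip(), corrected.strip()


if __name__ == "__main__":
    print(sum(len(suffixes) for suffixes in TAGS.values()))
    print(describe("B-GLAG"))
    print(parse_example("bolše > boljše"))
```

Keeping the level and the category separate makes it easy to count errors at either granularity, for example all lexical errors versus verb-related lexical errors only.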

Attention!

As error annotation is to some extent subjective, this must be taken into account when analysing the results.

Error annotation manual

To keep the error annotation consistent, all annotators follow an Error tagging manual (Slovene version only). The manual is updated as necessary; it was last updated in October 2023.

Error annotation app

The errors are annotated in Svala, a specially developed application.