Crowdsourcing

In line with trends in modern lexicography, the Centre for Language Resources and Technologies of the University of Ljubljana includes user feedback in the development of language resources. Aside from conducting user needs studies that form the basis for user-friendly interfaces, we also develop language resources through crowdsourcing, in which users solve microtasks and apply their knowledge to speed up development. We have implemented user contribution methods in a number of our resources, such as the Thesaurus of Modern Slovene, which enables users to add new synonym candidates or rate existing ones; the Collocations Dictionary of Modern Slovene, which allows users to rate collocations; and the Sloleks Morphological Lexicon of Slovene, which allows users to add or rate pronunciation recordings and rate accented word forms.

We also implement crowdsourcing in the development of resources through gamification – combining crowdsourcing microtasks with a gaming style that is useful, educational and entertaining for users. In A Game of Words, for instance, users can compete with each other in finding connections between words that typically co-occur, thereby contributing to the improvement of the database of the Collocations Dictionary of Modern Slovene.

For certain linguistic tasks, we also use the PyBossa crowdsourcing platform. Our local installation of PyBossa has already been used for a number of projects, e.g. sorting examples of use under the correct headword senses for the Collocations Dictionary of Modern Slovene, correcting automatically generated accented word forms in the Sloleks Morphological Lexicon of Slovene, and annotating socially unacceptable discourse in online comments within the framework of the FRENK project.

LINKS AND CONTACT

Centre for Language Resources and Technologies, University of Ljubljana
Večna pot 113, SI-1000 Ljubljana, Slovenia