Thesaurus of Modern Slovene

Sopomenke 1.0


With the Thesaurus of Modern Slovene, we are introducing a new type of dictionary called the responsive dictionary. The initial database of a responsive dictionary is constructed using advanced computational methods, instantly providing the language community with a large amount of relevant, albeit still somewhat noisy language information. A responsive dictionary is characterized by two more key traits: first, its database is openly accessible, and second, it provides a number of ways for the language community to improve the database and clean up noisy elements. This means that the construction of a responsive dictionary is never truly concluded as its data constantly evolves in accordance with changes in the modern language. All changes can be tracked using timestamps in individual entries, while the different versions of the database are stored in a dedicated archive. The responsive dictionary takes its name from the fact that the approach to its construction allows the data to continuously respond to the opinions of the contributing language community and the changes in language originating from text produced by the language community. Essentially, it is “a dictionary made by the community for the community”.

The Thesaurus of Modern Slovene is based on the data contained in two principal language resources: The Oxford®-DZS Comprehensive English-Slovenian Dictionary and the Gigafida reference corpus of written Slovene. Both resources contain language material created after 1991 and as such offer a description of modern Slovene. The links identified between synonyms were additionally confirmed using the older Dictionary of Standard Slovenian Language (SSKJ). Data extraction and the structure of the Thesaurus were based on how often, and in what manner, words co-occur in the translation strings of the Oxford-DZS Dictionary. This information is the basis for discriminating between ‘core’ and ‘near’ synonyms, with ‘core’ synonyms exhibiting a greater degree of connection to the keyword. In the following step, an approach combining balanced co-occurrence graphs and the Personalized PageRank algorithm automatically divides the synonyms into subgroups and ranks them according to the degree of semantic relatedness to the keyword, as well as their frequency in language use. For a more detailed description of this methodology, see Krek et al. (2017).
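The ranking step can be illustrated with a minimal sketch. It is not the Thesaurus implementation: the function names, the toy translation strings, and the damping value are all assumptions made for illustration, and the sketch covers only the ranking by relatedness, not the division into subgroups or the corpus frequency weighting. It builds a weighted co-occurrence graph from lists of words that share a translation string and then runs Personalized PageRank, a random walk that keeps restarting at the keyword, so that words closely connected to the keyword collect the highest scores.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(translation_strings):
    """Weight each word pair by the number of translation strings they share."""
    graph = defaultdict(lambda: defaultdict(float))
    for words in translation_strings:
        for a, b in combinations(sorted(set(words)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def personalized_pagerank(graph, keyword, damping=0.85, iterations=50):
    """Power iteration for a random walk that restarts at `keyword`."""
    if keyword not in graph:
        raise ValueError(f"{keyword!r} does not occur in the graph")
    nodes = list(graph)
    scores = {node: 1.0 if node == keyword else 0.0 for node in nodes}
    for _ in range(iterations):
        updated = {node: 0.0 for node in nodes}
        for node in nodes:
            total_weight = sum(graph[node].values())
            for neighbour, weight in graph[node].items():
                # Pass on a share of the score proportional to the edge weight.
                updated[neighbour] += damping * scores[node] * weight / total_weight
        # The remaining probability mass always returns to the keyword.
        updated[keyword] += 1.0 - damping
        scores = updated
    return scores


def rank_synonyms(graph, keyword):
    """Return synonym candidates ordered by decreasing relatedness score."""
    scores = personalized_pagerank(graph, keyword)
    candidates = [node for node in scores if node != keyword]
    return sorted(candidates, key=lambda node: scores[node], reverse=True)


# Hypothetical translation strings, invented for illustration only.
toy_strings = [
    ["hiter", "brz", "uren"],
    ["hiter", "uren"],
    ["nagel", "hiter"],
    ["nagel", "prenagljen"],
]
toy_graph = build_cooccurrence_graph(toy_strings)
ranked = rank_synonyms(toy_graph, "hiter")
print(ranked)
```

In this toy graph, the word that shares the most translation strings with the keyword ranks first, while a word linked to the keyword only through an intermediate word ranks last. A cut-off on the scores could then serve as one possible way to separate ‘core’ from ‘near’ synonyms, although the source does not state how that threshold is chosen.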

