Publications 2026

This list includes all publications produced within the LLM4DH project in 2026.

Book chapters

Journal articles

  • Ulčar, M., Žagar, A., Armendariz, C.S., Repar, A., Pollak, S., Purver, M., and Robnik-Šikonja, M. (2026). Mono- and cross-lingual evaluation of representation language models on less-resourced languages. Computer Speech & Language, 95, 101852. https://doi.org/10.1016/j.csl.2025.101852
  • Verdonik, D., and Vidinić, J. (2026). Dialoška dejanja v zasebni govorni interakciji / Dialogue acts in private spoken interaction. Slavistična revija, 74(1), 77–96. https://srl.si/ojs/srl/article/view/4295

Conference papers

  • Knez, T. and Žitnik, S. (2026). Improving Slovene Language Models for Lexicographic Question Answering through Continued Pretraining and Instruction Fine-Tuning. Proceedings of the Workshop on Structured Linguistic Data and Evaluation, pp. 114–123. https://www.slide-workshop.org/book.pdf#page=128

Datasets

  • Arčon, T., Klemen, M., Robnik-Šikonja, M., Dobrovoljc, K. and Terčon, L. (2026). A multilingual benchmark for evaluating metalinguistic knowledge WALS-Bench 1.0, Slovenian language resource repository CLARIN.SI.
  • Kuzman Pungeršek, T., Rupnik, P. and Ljubešić, N. (2026). South Slavic web corpus collection CLASSLA-web 2.0, Slovenian language resource repository CLARIN.SI. http://hdl.handle.net/11356/2079
  • Terčon, L., Dobrovoljc, K., Klemen, M., Arčon, T. and Robnik-Šikonja, M. (2026). Corpus-grounded evaluation dataset for grammatical question answering GramQA 1.0, Slovenian language resource repository CLARIN.SI. http://hdl.handle.net/11356/2086

Other

  • Large Language Models for Digital Humanities (2026). ContRAG: Contradiction-Aware Retrieval for Legal Texts [Large Language Model]. LLM4DH. https://github.com/clarinsi/LegalContradictionRAG