{"id":1178,"date":"2024-12-24T10:51:43","date_gmt":"2024-12-24T09:51:43","guid":{"rendered":"https:\/\/www.cjvt.si\/llm4dh\/?page_id=1178"},"modified":"2025-05-14T12:29:44","modified_gmt":"2025-05-14T10:29:44","slug":"challenge-5","status":"publish","type":"page","link":"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-5\/","title":{"rendered":"Challenge 5: Selected DH Challenges"},"content":{"rendered":"

Challenge 5: Selected DH Challenges <\/strong><\/h1>\n<\/div><\/section><\/div>\n

Digital humanities is a broad research area spanning many disciplines in the humanities and social sciences. This challenge addresses three selected, high-impact problems in the field.<\/p>\n<\/div><\/section><\/div>\n<\/div><\/div><\/div><\/div><\/div>

Task 5.1<\/span><\/span><\/span><\/span><\/a>Task 5.2<\/span><\/span><\/span><\/span><\/a>Task 5.3<\/span><\/span><\/span><\/span><\/a>Yearly reports<\/span><\/span><\/span><\/span><\/a><\/div>
<\/span><\/span>\n

T5.1 LLMs for historiography <\/em><\/strong><\/h3>\n<\/div><\/section>
\n
\n
\n

DH research on discourse in large language corpora has traditionally relied on unsupervised text classification techniques such as topic modeling. However, many widely used techniques are unstable and prone to overfitting. This is especially problematic when uncovering latent features of discourse, such as its ideological underpinnings, which require complex linguistic evidence. To address this challenge, we aim to combine LLM-generated knowledge with named-entity graphs. Because such graphs consist of interconnected sets of dynamic relations between (named) entities (Hogan et al. 2021), they can be used effectively to integrate and conceptualize underlying discursive phenomena such as ideologies (van Dijk 2017). We aim to build LLM-driven knowledge graphs for critical discourse analysis and to reconstruct historical identities from Slovenian historical newspapers that served as key instruments of political, social, and institutional power (van Dijk 2013). This will enable diachronic analysis of ideological change and the attendant lexical-semantic shifts in historical newspaper discourse.<\/p>\n<\/div>\n

First, we will use named-entity graphs from T4.1 to explore the relationships between people, places, and organizations in sPeriodika 1.0, a corpus of Slovenian historical periodicals (1771\u20131914). We will analyze these graphs with a mixed-methods approach that combines quantitative network analysis with critical discourse analysis. The investigation will focus on the emergence and development of intertwined historical identities: national, linguistic, political, socio-economic, and religious. Second, because relations between historical identities can undergo semantic shifts over time, we will use diachronic analysis from T4.2 to study the attitudes, prejudices, and ideologies of social elites in different time periods. We will use the graphs as proxies for dynamically changing identities to examine how language ideologies were perpetuated and how they were historically tied to other identity-making aspects of individuals and places. We will also study diachronic semantic shifts in the lexical inventory related to the rise and fall of historical nationalisms, focusing on concepts of nationhood. Specifically, we will examine 1) oppositions between Pan-Slavic, Yugoslavian, and Slovenian identities, 2) how such notions were related to the major centers of power of the time (e.g., the Habsburg Empire until the early 20th century), and 3) how such identities incorporate smaller regional ones.<\/p>\n<\/section>\n

\n
<\/div>\n<\/section>\n<\/div><\/section>
\n