Deliverables
List of deliverables by work package and month
Task number | Deliverable | Type | Deliverable link |
January 2025 | |||
3.1.1 | Online interface for collecting conversational speech data (M4). | Application | Access: |
7.3.1 | Dissemination and Communication Plan (M4). | Report | Access: |
March 2025 | |||
1.1.1 | DDDS and OSWN datasets ready for training (M6). | Dataset | Access: |
7.2.1 | Data Management Plan (M6). | Report | Access: |
April 2025 | |||
3.1.2 | Manually transcribed conversation data for the spoken learning corpus (5 hours) (M7). | Dataset | Access: |
7.2.2 | Code of ethics, risk monitoring activities (M1-M36). | Report | Access: PDF file (in Slovene) |
July 2025 | |||
3.2.1 | Expanded spoken learning corpus with dialogue act and sentiment annotations (min. 5 hours of conversational speech) (M10). | Dataset | |
September 2025 | |||
1.1.2 | Initial improved LLM (M12). | Model | |
1.3.1 | Slovene datasets for training VLM (M12). | Dataset | |
2.2.1 | Synthetic language error datasets (M12). | Dataset | |
2.3.1 | LLM with improved grammatical knowledge (M12). | Model | |
4.1.1 | Interaction graphs of historical named entities (M12). | Other | |
6.1.1 | Metaphor, irony, and sarcasm benchmark in Slovene (M12). | Dataset | |
7.1.1 | Annual reports (M12). | Report | |
March 2026 | |||
1.2.1 | KGs and raw texts datasets (M18). | Dataset | |
2.2.2 | Grammar checking LLMs (M18). | Model | |
2.3.2 | Dataset for evaluating grammatical knowledge of LLMs (M18). | Dataset | |
4.4.1 | A new RAG system for Slovenian capable of detecting contradictions in documents (M18). | Application | |
5.1.1 | Novel methodological approaches to historical and ideological analysis using LLMs (M18). | Report | |
5.2.1 | Novel methodology for digital folkloristics (M18). | Report | |
5.3.1 | Database of Slovene legal texts (M18). | Database | |
September 2026 | |||
1.1.3 | Final improved LLM (M24). | Model | |
1.2.2 | Initial improved LLMs (M24). | Model | |
1.3.2 | Slovene VLM model (M24). | Model | |
2.1.1 | DDDS with generated lexicographic data – first version (M24). | Dataset | |
2.2.3 | Authentic grammar checking evaluation datasets (M24). | Dataset | |
4.1.2 | Visualization of extracted named entity graphs (M24). | Report | |
4.2.1 | Novel methodology for diachronic analysis using LLMs (M24). | Report | |
4.3.1 | Dataset of images from Slovene historical periodicals (M24). | Dataset | |
6.1.2 | Pragmatic and associative behavior explanation benchmark (M24). | Dataset | |
6.3.1 | Bias detection datasets for Slovene (M24). | Dataset | |
7.1.2 | Annual reports (M24). | Report | |
October 2026 | |||
3.1.3 | Manually multi-reference-transcribed data for the spoken benchmark corpus (1 hour new data + 3 hours of existing ASR data) (M25). | Dataset | |
March 2027 | |||
1.2.3 | Final improved LLMs (M30). | Model | |
3.2.2 | Models for dialogue act and sentiment identification in Slovenian speech (M30). | Model | |
4.3.2 | VLM adapted for selected DH tasks (M30). | Model | |
6.2.1 | Speech dataset (4 hours) with dialogue act and sentiment annotations (M30). | Dataset | |
6.3.2 | Debiasing approach for LLMs (M30). | Report | |
September 2027 | |||
2.1.2 | DDDS with generated lexicographic data – final version (M36). | Other | |
2.3.3 | Multilingual and cross-lingual grammatical analyses (M36). | Report | |
3.1.4 | Audio speech database of conversation data (M36). | Database | |
3.3.1 | Slovenian audio speech database from publicly available resources (min. 300 hours) (M36). | Database | |
3.4.1 | A new ASR-LLM integration method for domain-specific ASR for low-resource languages (M36). | Program | |
5.1.2 | Novel analyses of ideological concepts through history (M36). | Report | |
5.2.2 | Novel analyses of conflict resolution rituals (M36). | Report | |
5.3.2 | A RAG-based system for Slovene legal support (M36). | Other | |
6.2.2 | Multi-reference ASR task, dialogue processing task, and sentiment in speech tasks (M36). | Report | |
6.3.3 | Spoken language bias detection analysis (M36). | Report | |
6.4.1 | A novel knowledge-based explanation methodology for LLM explanation (M36). | Report | |
7.1.3 | Annual reports (M36). | Report | |
7.3.2 | At least 40 conference publications or submitted journal articles (M36). | Report | |
7.3.3 | The above-mentioned activities (M36). | Other |