Results
List of Results by Projects and Months
*No. | Name | Organization | Type | TRL | Links | |
August 2024 |
D1.1 | Accessible Slovenian training dataset for dialogues and command requests. | FRI | data | 3 | Access: | |
D1.2 | Large language corpus for conversational language and addressed terminological areas – first version. | FRI | data | 3 | Access: | |
D1.3 | Validation corpus for large language models. | FRI | data | 3 | Access: | |
D2.1 | Open-access large generative language model tailored for dialogues and commands with a size of one billion parameters. | FRI | software | 3 | Access: | |
D3.1 | Training set with at least 10,000 examples. | Semantika | data | 5 | Access: | |
D4.1 | Training set with specific dialogues and commands from the field of medical applications, consisting of at least 10,000 examples. | Better | data (description) | 5 | Access: | |
D5.1 | Analysis of the possibilities of using speech and language technologies to improve the efficiency of human-machine communication in industrial environments. | Špica | documentation (description) | 5 | Access: (in Slovene) | |
D5.2 | Report on the suitability of technical equipment and possible methods of integrating language and speech technologies in manufacturing environments. | Špica | documentation (description) | 5 | Access: (in Slovene) | |
D6.1 | Development of the initial plan and approaches for the effective use of large language models in IaC (Infrastructure as Code): Use and selection of training data, description of possible approaches. | XLAB | documentation (description) | 5 | Access: (in Slovene) | |
D6.2 | First version of software enabling the use of language technologies in IaC, taking into account performance, robustness, and security limitations. | XLAB | software (description) | 5 | Access: (in Slovene) | |
February 2025 |
D1.4 | Tools for preparing lexical databases for model training and components for the HuggingFace pipeline for integrating open lexical forms. | FRI | software | 3 | ||
D1.5 | Dedicated tokenizers for the Slovenian language – first version. | FRI | software | 3 | ||
D2.2 | Open-access large generative language model tailored for dialogues and commands with a size of 10 billion parameters. | FRI | software | 3 | ||
D3.2 | Calibrated SloLLaMai models for the humanities and instruction following. | Semantika | software | 5 | ||
D4.2 | Large generative language model adapted for the field of medicine. | Better | software | 5 | ||
August 2025 |
D1.6 | Large language corpus for conversational language and addressed terminological areas – second version. | FRI | data | 4 | ||
D2.3 | Open-access computationally lightweight generative language model tailored for dialogues and commands. | FRI | software | 4 | ||
D3.3 | Demonstration application for OCR. | Semantika | software | 6 | ||
D3.4 | Demonstration application for semantic search. | Semantika | software | 6 | ||
D4.3 | Precise recognizer of Slovenian speech specialized for the field of medicine. | Better | software | 5 | ||
D5.3 | Online service for acoustic preprocessing of audio signals and noise reduction. | Špica | software | 5 | ||
D5.4 | Accurate and robust multilingual speech recognition model for South Slavic languages. | Špica | software | 5 | ||
D6.3 | Refinement of the plan and corrections to approaches for the effective use of large language models in IaC (Infrastructure as Code): Use and selection of training data, description of approaches under development. | XLAB | documentation | 5 | ||
D6.4 | Second version of software enabling the use of language technologies in IaC, taking into account performance, robustness, and security limitations – achieved TRL5. | XLAB | software | 5 | ||
February 2026 |
D1.7 | Dedicated tokenizers for the Slovenian language – final version. | FRI | software | 4 | ||
D2.4 | Open-access large language model with embedded additional knowledge. | FRI | software | 3 | ||
D3.5 | Demonstration application for the automatic generation of collection descriptions. | Semantika | software | 6 | ||
D3.6 | Demonstration application for a summarizer. | Semantika | software | 6 | ||
D3.7 | Demonstration application for a translator. | Semantika | software | 6 | ||
D3.8 | Demonstration application for translation between instructions in natural language and command language. | Semantika | software | 6 | ||
D3.9 | Development of an application for machine entity extraction and document anonymization, along with a demonstration on databases acquired by Semantika. | Semantika | software | 5 | ||
D4.4 | Medical application using a speech recognizer and a large generative language model. | Better | software | 6 | ||
June 2026 |
D1.8 | Large language corpus for conversational language and addressed terminological areas – final version. | FRI | data | 4 | ||
D1.9 | Knowledge base created based on the Digital Lexical Database. | FRI | data | 3 | ||
D3.10 | Demonstration of an upgraded digital guide. | Semantika | software | 6 | ||
D5.5 | Prototype integration of a speech communication system with a selected business process management solution in manufacturing. | Špica | software | 6 | ||
D6.5 | Final version of software enabling the use of language technologies in IaC, taking into account performance, robustness, and security limitations – achieved TRL6. | XLAB | software | 6 | ||