{"id":962,"date":"2020-03-31T22:48:56","date_gmt":"2020-03-31T20:48:56","guid":{"rendered":"https:\/\/www.cjvt.starkmat.si\/template-projekt\/work-packages\/work-package-3\/"},"modified":"2024-01-25T12:20:54","modified_gmt":"2024-01-25T11:20:54","slug":"work-package-3","status":"publish","type":"page","link":"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/","title":{"rendered":"Work Package 3: VeMo-Digi"},"content":{"rendered":"<div class=\"flex_column av_one_full  no_margin flex_column_div av-zero-column-padding first  avia-builder-el-0  el_before_av_one_full  avia-builder-el-first  \" style='margin-top:0px; margin-bottom:30px; border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h2>Large language models for advanced domain-specific digitization<\/h2>\n<\/div><\/section><br \/>\n<section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h3><span class=\"TextRun SCXW134244926 BCX0\" lang=\"SL-SI\" xml:lang=\"SL-SI\" data-contrast=\"auto\"><span class=\"NormalTextRun SpellingErrorV2Themed SCXW134244926 BCX0\">VeMo-Digi<\/span><\/span><\/h3>\n<\/div><\/section><\/p><\/div>\n<div class=\"flex_column av_one_full  no_margin flex_column_div av-zero-column-padding first  avia-builder-el-3  el_after_av_one_full  el_before_av_one_full  column-top-margin\" style='margin-top:0px; margin-bottom:30px; border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><p>The project will apply language models to digitalization across various domains. Semantika d.o.o. 
already markets a range of solutions internationally (e.g., the Museums platform), and one of the significant challenges is preparing multilingual presentation materials. Developing a machine translator and integrating it into these products will make it easier for Slovenian users of the company&#8217;s product portfolio to reach an international audience, multiplying the effect of the results. These solutions can also be extended to other less-represented languages in its markets (e.g., Bosnia and Herzegovina).<\/p>\n<\/div><\/section><\/div>\n<div class=\"flex_column av_one_full  no_margin flex_column_div av-zero-column-padding first  avia-builder-el-5  el_after_av_one_full  el_before_av_one_full  column-top-margin\" style='margin-top:0px; margin-bottom:30px; border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h3>Specific objectives:<\/h3>\n<ol>\n<li>Create a dataset of instructions and dialogues from the field of humanities.<\/li>\n<li>Upgrade and expand the existing infrastructure for digitizing museum collections and archival materials using optical character recognition enhanced with context-sensitive spelling correction based on the SloLLaMai model.<\/li>\n<li>Demonstration of a semantic search engine with natural-language communication support, implemented as an internal document search engine with a proof of concept (PoC) in digital humanities and highly regulated industries.<\/li>\n<li>Demonstration of automatic generation, editing, and optimization of texts for the online publication of museum documentation and materials.<\/li>\n<li>Demonstration of automatic searching, extraction, and summarization of relevant information from freely available sources based on a provided topic (e.g., preparing a short description of the work&#8217;s author based on 
Wikipedia).<\/li>\n<li>Demonstration of machine interpretation of sequences of instructions in natural language for working with an application.<\/li>\n<li>Demonstration of machine translation of texts using large language models between Slovenian and a) other languages and b) older forms of Slovenian, with translation of archival materials from INZ into contemporary Slovenian.<\/li>\n<li>Demonstration of a speech interface using the example of an electronic guide.<\/li>\n<li>Demonstration of solutions for automatic extraction of entities (people, places, etc.) using large language models, and of log anonymization based on them.<\/li>\n<\/ol>\n<\/div><\/section><\/div><div class=\"flex_column av_one_full  no_margin flex_column_div av-zero-column-padding first  avia-builder-el-7  el_after_av_one_full  el_before_av_one_full  column-top-margin\" style='margin-top:0px; margin-bottom:30px; border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h3>Results:<\/h3>\n<ul>\n<li style=\"list-style-type: none;\"><\/li>\n<li><strong>D3.1:<\/strong> Training dataset with at least 10,000 examples (August 2023).<\/li>\n<li><strong>D3.2:<\/strong> Calibrated SloLLaMai models for humanities and instruction following (February 2025).<\/li>\n<li><strong>D3.3:<\/strong> Demonstration application for OCR (August 2025).<\/li>\n<li><strong>D3.4:<\/strong> Demonstration application for semantic search (August 2025).<\/li>\n<li><strong>D3.5:<\/strong> Demonstration application for automatic generation of collection descriptions (February 2026).<\/li>\n<li><strong>D3.6:<\/strong> Demonstration application for summarization (February 2026).<\/li>\n<li><strong>D3.7:<\/strong> Demonstration application for translation (February 2026).<\/li>\n<li><strong>D3.8:<\/strong> Demonstration application for translating between natural language instructions 
and command language (February 2026).<\/li>\n<li><strong>D3.9:<\/strong> Development of an application for machine entity extraction and document anonymization, with a demonstration on Semantika&#8217;s datasets (February 2026).<\/li>\n<\/ul>\n<\/div><\/section><\/div><\/p>\n<div class=\"flex_column av_one_full  no_margin flex_column_div av-zero-column-padding first  avia-builder-el-9  el_after_av_one_full  el_before_av_one_fourth  column-top-margin\" style='margin-top:0px; margin-bottom:30px; border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h3>Project partners:<\/h3>\n<\/div><\/section><\/div>\n<div class=\"flex_column av_one_fourth  flex_column_div av-zero-column-padding first  avia-builder-el-11  el_after_av_one_full  el_before_av_one_fourth  column-top-margin\" style='border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h5>Project leader:<\/h5>\n<\/div><\/section><\/div>\n<div class=\"flex_column av_one_fourth  flex_column_div av-zero-column-padding   avia-builder-el-13  el_after_av_one_fourth  el_before_av_one_fourth  column-top-margin\" style='border-radius:0px; '><div  class='avia-button-wrap avia-button-center  avia-builder-el-14  avia-builder-el-no-sibling ' ><a href='https:\/\/semantika.eu\/en-us\/' class='avia-button avia-button-fullwidth   avia-icon_select-no avia-color-theme-color '  style='color:#ffffff; ' ><span class='avia_iconbox_title' >Semantika d.o.o.<\/span><span class='avia_button_background avia-button avia-button-fullwidth avia-color-theme-color-highlight' ><\/span><\/a><\/div><\/div>\n<div class=\"flex_column av_one_fourth  flex_column_div av-zero-column-padding   avia-builder-el-15  el_after_av_one_fourth  el_before_av_one_fourth  column-top-margin\" 
style='border-radius:0px; '><\/div>\n<div class=\"flex_column av_one_fourth  flex_column_div av-zero-column-padding   avia-builder-el-16  el_after_av_one_fourth  el_before_av_one_fourth  column-top-margin\" style='border-radius:0px; '><\/div>\n<div class=\"flex_column av_one_fourth  flex_column_div av-zero-column-padding first  avia-builder-el-17  el_after_av_one_fourth  el_before_av_one_fourth  column-top-margin\" style='border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h5>Partners:<\/h5>\n<\/div><\/section><\/div>\n<div class=\"flex_column av_one_fourth  flex_column_div av-zero-column-padding   avia-builder-el-19  el_after_av_one_fourth  el_before_av_one_fourth  column-top-margin\" style='border-radius:0px; '><div  class='avia-button-wrap avia-button-center  avia-builder-el-20  avia-builder-el-no-sibling ' ><a href='https:\/\/www.fri.uni-lj.si\/en' class='avia-button avia-button-fullwidth   avia-icon_select-no avia-color-theme-color '  style='color:#ffffff; ' ><span class='avia_iconbox_title' >Faculty of Computer and Information Science UL<\/span><span class='avia_button_background avia-button avia-button-fullwidth avia-color-theme-color-highlight' ><\/span><\/a><\/div><\/div>\n<div class=\"flex_column av_one_fourth  flex_column_div av-zero-column-padding   avia-builder-el-21  el_after_av_one_fourth  avia-builder-el-last  column-top-margin\" style='border-radius:0px; '><div  class='avia-button-wrap avia-button-center  avia-builder-el-22  avia-builder-el-no-sibling ' ><a href='https:\/\/www.inz.si\/en\/' class='avia-button avia-button-fullwidth   avia-icon_select-no avia-color-theme-color '  style='color:#ffffff; ' ><span class='avia_iconbox_title' >Institute of Contemporary History<\/span><span class='avia_button_background avia-button avia-button-fullwidth avia-color-theme-color-highlight' 
><\/span><\/a><\/div><\/div>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"parent":953,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"_relevanssi_hide_post":"","_relevanssi_hide_content":"","_relevanssi_pin_for_all":"","_relevanssi_pin_keywords":"","_relevanssi_unpin_keywords":"","_relevanssi_related_keywords":"","_relevanssi_related_include_ids":"","_relevanssi_related_exclude_ids":"","_relevanssi_related_no_append":"","_relevanssi_related_not_related":"","_relevanssi_related_posts":"","_relevanssi_noindex_reason":"","inline_featured_image":false,"episode_type":"","audio_file":"","cover_image":"","cover_image_id":"","duration":"","filesize":"","filesize_raw":"","date_recorded":"","explicit":"","block":"","itunes_episode_number":"","itunes_title":"","itunes_season_number":"","itunes_episode_type":"","footnotes":""},"class_list":["post-962","page","type-page","status-publish","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.9 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Work Package 3: VeMo-Digi - PoVeJMo<\/title>\n<meta name=\"robots\" content=\"noindex, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Work Package 3: VeMo-Digi - PoVeJMo\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/\" \/>\n<meta property=\"og:site_name\" content=\"PoVeJMo\" \/>\n<meta property=\"article:modified_time\" content=\"2024-01-25T11:20:54+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/\",\"url\":\"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/\",\"name\":\"Work Package 3: VeMo-Digi - PoVeJMo\",\"isPartOf\":{\"@id\":\"https:\/\/www.cjvt.si\/povejmo\/en\/#website\"},\"datePublished\":\"2020-03-31T20:48:56+00:00\",\"dateModified\":\"2024-01-25T11:20:54+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.cjvt.si\/povejmo\/o-programu\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Work Packages\",\"item\":\"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Work Package 3\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.cjvt.si\/povejmo\/en\/#website\",\"url\":\"https:\/\/www.cjvt.si\/povejmo\/en\/\",\"name\":\"PoVeJMo\",\"description\":\"Work site\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.cjvt.si\/povejmo\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Work Package 3: VeMo-Digi - PoVeJMo","robots":{"index":"noindex","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"og_locale":"en_US","og_type":"article","og_title":"Work Package 3: VeMo-Digi - PoVeJMo","og_url":"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/","og_site_name":"PoVeJMo","article_modified_time":"2024-01-25T11:20:54+00:00","twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/","url":"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/","name":"Work Package 3: VeMo-Digi - PoVeJMo","isPartOf":{"@id":"https:\/\/www.cjvt.si\/povejmo\/en\/#website"},"datePublished":"2020-03-31T20:48:56+00:00","dateModified":"2024-01-25T11:20:54+00:00","breadcrumb":{"@id":"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/vemo-digi\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.cjvt.si\/povejmo\/o-programu\/"},{"@type":"ListItem","position":2,"name":"Work Packages","item":"https:\/\/www.cjvt.si\/povejmo\/en\/work-packages\/"},{"@type":"ListItem","position":3,"name":"Work Package 3"}]},{"@type":"WebSite","@id":"https:\/\/www.cjvt.si\/povejmo\/en\/#website","url":"https:\/\/www.cjvt.si\/povejmo\/en\/","name":"PoVeJMo","description":"Work 
site","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.cjvt.si\/povejmo\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/www.cjvt.si\/povejmo\/en\/wp-json\/wp\/v2\/pages\/962","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.cjvt.si\/povejmo\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.cjvt.si\/povejmo\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.cjvt.si\/povejmo\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.cjvt.si\/povejmo\/en\/wp-json\/wp\/v2\/comments?post=962"}],"version-history":[{"count":7,"href":"https:\/\/www.cjvt.si\/povejmo\/en\/wp-json\/wp\/v2\/pages\/962\/revisions"}],"predecessor-version":[{"id":1625,"href":"https:\/\/www.cjvt.si\/povejmo\/en\/wp-json\/wp\/v2\/pages\/962\/revisions\/1625"}],"up":[{"embeddable":true,"href":"https:\/\/www.cjvt.si\/povejmo\/en\/wp-json\/wp\/v2\/pages\/953"}],"wp:attachment":[{"href":"https:\/\/www.cjvt.si\/povejmo\/en\/wp-json\/wp\/v2\/media?parent=962"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}