{"id":1163,"date":"2024-12-24T10:48:06","date_gmt":"2024-12-24T09:48:06","guid":{"rendered":"https:\/\/www.cjvt.si\/llm4dh\/?page_id=1163"},"modified":"2025-05-14T12:28:56","modified_gmt":"2025-05-14T10:28:56","slug":"challenge-3","status":"publish","type":"page","link":"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-3\/","title":{"rendered":"Challenge 3: LLMs for Spoken Language"},"content":{"rendered":"<div class=\"flex_column av_one_full  no_margin flex_column_div av-zero-column-padding first  avia-builder-el-0  el_before_av_one_full  avia-builder-el-first  \" style='margin-top:0px; margin-bottom:30px; border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h1><strong>Challenge 3: LLMs for Spoken Language<\/strong><\/h1>\n<\/div><\/section><\/div>\n<div class=\"flex_column av_one_full  no_margin flex_column_div av-zero-column-padding first  avia-builder-el-2  el_after_av_one_full  el_before_av_tab_section  avia-builder-el-last  column-top-margin\" style='margin-top:0px; margin-bottom:30px; border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><p>LLMs are fundamentally dependent on data, and language use varies substantially across different types of discourse. The majority of linguistic data derive from written resources. Extensive spoken data resources, where they exist, are typically owned by companies or institutions and are rarely available to the research community. However, even when these data are publicly accessible, they differ significantly from the speech used in everyday conversations. 
Features such as looser sentence structures, interruptions, cross-talk, silences, repairs, repetitions, misarticulations, clarifications, backchannels, conflicts, slurs, and jargon (Yeomans et al., 2023) exemplify surface-level specifics of conversation and indicate that investigating such data has great impact potential for linguistic research (Love et al. 2014) and poses significant challenges for artificial intelligence (Wahlster 2023). In this challenge, we aim to advance speech technologies and research through the latest methodologies in collecting, filtering, automatically transcribing, and pragmatically processing speech data with the help of LLMs.<\/p>\n<\/div><\/section><\/div>\n<\/div><\/div><\/div><!-- close content main div --><\/div><\/div><div id='av-tab-section-1'  class='av-tab-section-container entry-content-wrapper main_color av-tab-no-transition   av-tab-above-content  avia-builder-el-4  el_after_av_one_full  avia-builder-el-last  submenu-not-first container_wrap fullsize' style=' '  ><div class='av-tab-section-outer-container'><div class='av-tab-section-tab-title-container avia-tab-title-padding-default ' ><a href='#task-3.1' data-av-tab-section-title='1' class='av-section-tab-title av-active-tab-title no-scroll av-tab-no-icon av-tab-no-image  '><span class='av-outer-tab-title'><span class='av-inner-tab-title'>Task 3.1<\/span><\/span><span class='av-tab-arrow-container'><span><\/span><\/span><\/a><a href='#task-3.2' data-av-tab-section-title='2' class='av-section-tab-title  av-tab-no-icon av-tab-no-image  '><span class='av-outer-tab-title'><span class='av-inner-tab-title'>Task 3.2<\/span><\/span><span class='av-tab-arrow-container'><span><\/span><\/span><\/a><a href='#task-3.3' data-av-tab-section-title='3' class='av-section-tab-title  av-tab-no-icon av-tab-no-image  '><span class='av-outer-tab-title'><span class='av-inner-tab-title'>Task 3.3<\/span><\/span><span class='av-tab-arrow-container'><span><\/span><\/span><\/a><a 
href='#task-3.4' data-av-tab-section-title='4' class='av-section-tab-title  av-tab-no-icon av-tab-no-image  '><span class='av-outer-tab-title'><span class='av-inner-tab-title'>Task 3.4<\/span><\/span><span class='av-tab-arrow-container'><span><\/span><\/span><\/a><a href='#yearly-reports' data-av-tab-section-title='5' class='av-section-tab-title  av-tab-no-icon av-tab-no-image  '><span class='av-outer-tab-title'><span class='av-inner-tab-title'>Yearly reports<\/span><\/span><span class='av-tab-arrow-container'><span><\/span><\/span><\/a><\/div><div class='av-tab-section-inner-container avia-section-default' style='width:500vw; left:0%;'><span class='av_prev_tab_section av_tab_navigation'><\/span><span class='av_next_tab_section av_tab_navigation'><\/span>\n<div data-av-tab-section-content=\"1\" class=\"av-layout-tab av-animation-delay-container av-active-tab-content __av_init_open  avia-builder-el-5  el_before_av_tab_sub_section  avia-builder-el-first   \" style='vertical-align:middle; '  data-tab-section-id=\"task-3.1\"><div class='av-layout-tab-inner'><div class='container'><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h3><strong><em>T<\/em><\/strong><strong><em>3.1 Efficient spoken language data collection <\/em><\/strong><\/h3>\n<\/div><\/section><br \/>\n<section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><section class=\"av_textblock_section \">\n<div class=\"avia_textblock \">\n<p>Some conversational data in Slovene are available (Verdonik et al. 2024), yet they fall significantly short of the current state-of-the-art (Love et al. 2017). Thus, developing a modern and sustainable solution for conversational data collection is the first challenge we address. 
Collecting speech data remotely is essential for obtaining geographically dispersed data, especially for languages other than English (Parent &amp; Esk\u00e9nazi, 2011). However, the primary challenge lies in motivating citizens to donate speech. Little attention has been devoted to understanding citizens\u2019 motivation for contributing speech data or citizens\u2019 attitudes toward technologies based on LLMs. We aim to investigate citizens\u2019 perspectives on speech-enabled LLM technologies, identify their motivations, and establish an effective online approach for sustainable speech data acquisition.<\/p>\n<\/div>\n<p>We will build on successful crowdsourcing platforms, Games-With-A-Purpose (GWAPs) and Collect4NLP, to design an online interface for collecting conversational speech data that takes into account economic, ethical, and legal aspects while simplifying and automating metadata collection. Further, we will investigate user perspectives on speech-enabled LLM technologies and explore strategies to motivate speakers to contribute to speech databases, e.g., the role of trust and privacy. We will apply self-determination theory to highlight the role of intrinsic motivations such as contributing to scientific research, preserving language heritage, and social connection. 
The collected audio data, the manual transcriptions, and the multi-reference transcriptions of the benchmark portion of the data will be publicly released.<\/p>\n<div class=\"avia_textblock \"><\/div>\n<\/section>\n<section class=\"av_textblock_section \">\n<div class=\"avia_textblock \"><\/div>\n<\/section>\n<\/div><\/section><br \/>\n<div class=\"flex_column av_one_fifth  flex_column_div av-zero-column-padding first  avia-builder-el-8  el_after_av_textblock  el_before_av_four_fifth  column-top-margin\" style='border-radius:0px; '><span  class=\"av_font_icon avia_animate_when_visible avia-icon-animate  av-icon-style-  av-no-color avia-icon-pos-left \" style=\"\"><span class='av-icon-char' style='font-size:40px;line-height:40px;' aria-hidden='true' data-av_icon='\ue810' data-av_iconfont='entypo-fontello' ><\/span><\/span><\/div><div class=\"flex_column av_four_fifth  flex_column_div av-zero-column-padding   avia-builder-el-10  el_after_av_one_fifth  avia-builder-el-last  column-top-margin\" style='border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><p><strong><em>Deliverables 3.1: Online interface for collecting conversational speech data (M4). Manually transcribed conversation data for the spoken learning corpus (5 hours) (M7). Manually multi-reference-transcribed data for the spoken benchmark corpus (1 hour new data + 3 hours of existing ASR data) (M25). Audio speech database of conversation data (M36). 
<\/em><\/strong><\/p>\n<\/div><\/section><\/div><\/p>\n<\/div><\/div><\/div><div data-av-tab-section-content=\"2\" class=\"av-layout-tab av-animation-delay-container   avia-builder-el-12  el_after_av_tab_sub_section  el_before_av_tab_sub_section   \" style='vertical-align:middle; '  data-tab-section-id=\"task-3.2\"><div class='av-layout-tab-inner'><div class='container'><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h3><strong><em>T3.2 Semantic and pragmatic speech processing <\/em><\/strong><\/h3>\n<\/div><\/section><br \/>\n<section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><section class=\"av_textblock_section \">\n<div class=\"avia_textblock \">\n<p>Humans interpret meaning in conversation across multiple dimensions. One critical level is the intent or function of an utterance. For example, a simple &#8220;Why?&#8221; can be interpreted as a request for information or an expression of hesitation, depending on the context. Speech act theory (Austin 1975) elucidated this level of meaning and became one of the most influential theories in pragmatics. However, it proved inadequate for authentic data (Levinson 2017). Consequently, researchers developed alternative interpretations under the term &#8216;dialogue act&#8217;. Describing speech use, particularly conversational speech, with robust, well-balanced, and appropriate annotations and interpretations of dialogue acts and related levels of meaning in context holds promise for a better understanding of how we communicate meaning, as well as for more efficient semantic and pragmatic processing of speech. With the advent of speech-based LLMs (Baevski et al. 2020; Radford et al. 
2022) that perform both feature extraction and long-range feature interaction, processing speech while taking into account pragmatic categories has become feasible (Miah et al. 2023).<\/p>\n<\/div>\n<\/section>\n<section class=\"av_textblock_section \">\n<div class=\"avia_textblock \">\n<p>Annotations of dialogue acts and related semantic levels, particularly the expression of sentiment, will be performed on Slovenian conversational data with a minimum duration of five hours. We will critically analyze the fundamental dialogue act categories and their relationships to other semantic levels, such as sentiment expression. We will automate dialogue act identification and sentiment identification from speech data, using speech-enabled LLMs that are also pre-trained on Slovenian, e.g., XLS-R, MMS, Whisper, and Seamless.<\/p>\n<\/div>\n<\/section>\n<\/div><\/section><br \/>\n<div class=\"flex_column av_one_fifth  flex_column_div av-zero-column-padding first  avia-builder-el-15  el_after_av_textblock  el_before_av_four_fifth  column-top-margin\" style='border-radius:0px; '><span  class=\"av_font_icon avia_animate_when_visible avia-icon-animate  av-icon-style-  av-no-color avia-icon-pos-left \" style=\"\"><span class='av-icon-char' style='font-size:40px;line-height:40px;' aria-hidden='true' data-av_icon='\ue810' data-av_iconfont='entypo-fontello' ><\/span><\/span><\/div><div class=\"flex_column av_four_fifth  flex_column_div av-zero-column-padding   avia-builder-el-17  el_after_av_one_fifth  avia-builder-el-last  column-top-margin\" style='border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><section class=\"av_textblock_section \">\n<div class=\"avia_textblock \"><\/div>\n<\/section>\n<section class=\"av_textblock_section \">\n<div class=\"avia_textblock \">\n<p><strong><em>Deliverables 3.2: Expanded learning spoken corpus with dialogue act 
and sentiment annotations (min. 5 hours of conversational speech) (M10). Models for dialogue act and sentiment identification in Slovenian speech (M30).<\/em><\/strong><\/p>\n<\/div>\n<\/section>\n<\/div><\/section><\/div><\/p>\n<\/div><\/div><\/div><div data-av-tab-section-content=\"3\" class=\"av-layout-tab av-animation-delay-container   avia-builder-el-19  el_after_av_tab_sub_section  el_before_av_tab_sub_section   \" style='vertical-align:middle; '  data-tab-section-id=\"task-3.3\"><div class='av-layout-tab-inner'><div class='container'><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h3><strong><em>T3.3 Speech data quality control and filtering <\/em><\/strong><\/h3>\n<\/div><\/section><br \/>\n<section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><section class=\"av_textblock_section \">\n<div class=\"avia_textblock \">\n<p>Speech data is crucial for developing LLMs and adapting them to recognize spoken language effectively (Prabhavalkar et al., 2023). However, collecting such data, especially for less-resourced languages like Slovenian, poses significant challenges. A typical strategy involves sourcing recordings from various publicly accessible media and streaming platforms. These recordings often vary greatly in terms of speaker diversity, domains, dialects, and acoustic environments, and exhibit a broad range of quality levels. Using unfiltered data for large-scale model training or fine-tuning can negatively impact the models&#8217; accuracy and reliability. Moreover, in line with green ICT principles, which prioritize reducing energy consumption and greenhouse gas emissions (Georgescu et al., 2021), indiscriminate use of large datasets is environmentally concerning. 
Therefore, assessing the suitability of data prior to its deployment in training or fine-tuning processes is imperative to manage large speech datasets effectively. This step is also crucial for maintaining accuracy and minimizing biases in digital humanities research that relies on language technologies. We aim to develop speech data selection methods to enhance efficiency in speech data collection and analysis.<\/p>\n<p>Slovenian speech data will be acquired from publicly available resources, e.g., publicly available videos. These datasets encompass a diverse range of speakers from various demographic groups. The recordings may present challenges such as low quality, background noises, non-speech elements (e.g., silences, singing, music), overlapping speech, and foreign language content. We will protect speakers who are underage or belong to socially vulnerable groups. Initially, we will analyze the characteristics and attributes of the sample data. The identified categories will be utilized to (a) select objective speech\/audio metrics for quality control and (b) define attribute-based acoustic classifications of the speech recordings. Both components will be integrated into a pre-selection process for the collected speech recordings. 
The resulting speech database will be evaluated using an ASR system based on large general-purpose speech models.<\/p>\n<\/div>\n<\/section>\n<\/div><\/section><br \/>\n<div class=\"flex_column av_one_fifth  flex_column_div av-zero-column-padding first  avia-builder-el-22  el_after_av_textblock  el_before_av_four_fifth  column-top-margin\" style='border-radius:0px; '><span  class=\"av_font_icon avia_animate_when_visible avia-icon-animate  av-icon-style-  av-no-color avia-icon-pos-left \" style=\"\"><span class='av-icon-char' style='font-size:40px;line-height:40px;' aria-hidden='true' data-av_icon='\ue810' data-av_iconfont='entypo-fontello' ><\/span><\/span><\/div><div class=\"flex_column av_four_fifth  flex_column_div av-zero-column-padding   avia-builder-el-24  el_after_av_one_fifth  avia-builder-el-last  column-top-margin\" style='border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><section class=\"av_textblock_section \">\n<div class=\"avia_textblock \"><\/div>\n<\/section>\n<section class=\"av_textblock_section \">\n<div class=\"avia_textblock \">\n<p><strong><em>Deliverables 3.3: Slovenian audio speech database from publicly available resources (min. 
300 hours) (M36)<\/em><\/strong><\/p>\n<\/div>\n<\/section>\n<\/div><\/section><\/div><\/p>\n<\/div><\/div><\/div><div data-av-tab-section-content=\"4\" class=\"av-layout-tab av-animation-delay-container   avia-builder-el-26  el_after_av_tab_sub_section  el_before_av_tab_sub_section   \" style='vertical-align:middle; '  data-tab-section-id=\"task-3.4\"><div class='av-layout-tab-inner'><div class='container'><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><h3><strong><em>T<\/em><\/strong><strong><em>3.4 LLMs for domain-specific speech recognition <\/em><\/strong><\/h3>\n<\/div><\/section><br \/>\n<section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><section class=\"av_textblock_section \">\n<div class=\"avia_textblock \">\n<p>Highly accurate automatic speech recognition (ASR) remains challenging, especially in real-world domains with background noise, accented speech, code-switching between languages\/dialects, and domain-specific vocabularies (Hu et al., 2024). Traditional ASR models often struggle in such conditions. Integrating LLMs pre-trained on vast text data can potentially improve ASR robustness by providing linguistic knowledge to constrain the transcription search space (Ma et al., 2024; Miao et al., 2022; Min &amp; Wang, 2023). However, effectively combining ASR and LLM components poses research challenges, specifically for less-resourced languages that are under-represented in LLM pretraining data (Zhengdong et al., 2024; Hu et al., 2023b). The objective of this research task is to conduct a comprehensive comparative analysis to assess the efficacy of various techniques for integrating ASR and LLMs, taking into account the limitations of less-resourced languages. 
We aim to develop an innovative method for domain-specific ASR that will leverage the potential of ASR-LLM integration and will be optimized for less-resourced languages.<\/p>\n<p>We will conduct a comprehensive study on integrating ASR and LLMs (covering shallow and deep fusion approaches) to measure the efficacy of the specific integration methods for ASR improvement and robustness. The focus will be on a) the identification of approaches that are best suited to less-resourced languages given the limitations of the LLMs available for them, and b) the usability of the particular ASR-LLM integration methods for domain\/context switching (in real time) without the need for ASR model adaptation. Based on the results, we will develop a new method for domain-specific speech recognition that will leverage the potential of ASR-LLM integration for less-resourced languages.<\/p>\n<\/div>\n<\/section>\n<\/div><\/section><br \/>\n<div class=\"flex_column av_one_fifth  flex_column_div av-zero-column-padding first  avia-builder-el-29  el_after_av_textblock  el_before_av_four_fifth  column-top-margin\" style='border-radius:0px; '><span  class=\"av_font_icon avia_animate_when_visible avia-icon-animate  av-icon-style-  av-no-color avia-icon-pos-left \" style=\"\"><span class='av-icon-char' style='font-size:40px;line-height:40px;' aria-hidden='true' data-av_icon='\ue810' data-av_iconfont='entypo-fontello' ><\/span><\/span><\/div><div class=\"flex_column av_four_fifth  flex_column_div av-zero-column-padding   avia-builder-el-31  el_after_av_one_fifth  avia-builder-el-last  column-top-margin\" style='border-radius:0px; '><section class=\"av_textblock_section \"  itemscope=\"itemscope\" itemtype=\"https:\/\/schema.org\/CreativeWork\" ><div class='avia_textblock  '   itemprop=\"text\" ><section class=\"av_textblock_section \">\n<div class=\"avia_textblock \"><\/div>\n<\/section>\n<section class=\"av_textblock_section \">\n<div class=\"avia_textblock \">\n<p><strong><em>Deliverable 3.4: A new 
ASR-LLM integration method for domain-specific ASR for low-resource languages (M36)<\/em><\/strong><\/p>\n<\/div>\n<\/section>\n<\/div><\/section><\/div><\/p>\n<\/div><\/div><\/div><div data-av-tab-section-content=\"5\" class=\"av-layout-tab av-animation-delay-container   avia-builder-el-33  el_after_av_tab_sub_section  avia-builder-el-last   \" style='vertical-align:middle; '  data-tab-section-id=\"yearly-reports\"><div class='av-layout-tab-inner'><div class='container'><\/div><\/div><\/div><\/div><\/div><\/div>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":19,"featured_media":0,"parent":953,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"_relevanssi_hide_post":"","_relevanssi_hide_content":"","_relevanssi_pin_for_all":"","_relevanssi_pin_keywords":"","_relevanssi_unpin_keywords":"","_relevanssi_related_keywords":"","_relevanssi_related_include_ids":"","_relevanssi_related_exclude_ids":"","_relevanssi_related_no_append":"","_relevanssi_related_not_related":"","_relevanssi_related_posts":"","_relevanssi_noindex_reason":"","inline_featured_image":false,"episode_type":"","audio_file":"","podmotor_file_id":"","podmotor_episode_id":"","cover_image":"","cover_image_id":"","duration":"","filesize":"","filesize_raw":"","date_recorded":"","explicit":"","block":"","itunes_episode_number":"","itunes_title":"","itunes_season_number":"","itunes_episode_type":"","footnotes":""},"class_list":["post-1163","page","type-page","status-publish","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Challenge 3: LLMs for Spoken Language - LLM4DH<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-3\/\" \/>\n<meta 
property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Challenge 3: LLMs for Spoken Language - LLM4DH\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-3\/\" \/>\n<meta property=\"og:site_name\" content=\"LLM4DH\" \/>\n<meta property=\"article:modified_time\" content=\"2025-05-14T10:28:56+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/work-packages\\\/work-package-3\\\/\",\"url\":\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/work-packages\\\/work-package-3\\\/\",\"name\":\"Challenge 3: LLMs for Spoken Language - LLM4DH\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/#website\"},\"datePublished\":\"2024-12-24T09:48:06+00:00\",\"dateModified\":\"2025-05-14T10:28:56+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/work-packages\\\/work-package-3\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/work-packages\\\/work-package-3\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/work-packages\\\/work-package-3\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Work Packages\",\"item\":\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/work-packages\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Challenge 3: LLMs for Spoken 
Language\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/#website\",\"url\":\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/\",\"name\":\"LLM4DH\",\"description\":\"Work site\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.cjvt.si\\\/llm4dh\\\/en\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Challenge 3: LLMs for Spoken Language - LLM4DH","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-3\/","og_locale":"en_US","og_type":"article","og_title":"Challenge 3: LLMs for Spoken Language - LLM4DH","og_url":"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-3\/","og_site_name":"LLM4DH","article_modified_time":"2025-05-14T10:28:56+00:00","twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-3\/","url":"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-3\/","name":"Challenge 3: LLMs for Spoken Language - LLM4DH","isPartOf":{"@id":"https:\/\/www.cjvt.si\/llm4dh\/en\/#website"},"datePublished":"2024-12-24T09:48:06+00:00","dateModified":"2025-05-14T10:28:56+00:00","breadcrumb":{"@id":"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-3\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-3\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/work-package-3\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.cjvt.si\/llm4dh\/en\/"},{"@type":"ListItem","position":2,"name":"Work Packages","item":"https:\/\/www.cjvt.si\/llm4dh\/en\/work-packages\/"},{"@type":"ListItem","position":3,"name":"Challenge 3: LLMs for Spoken Language"}]},{"@type":"WebSite","@id":"https:\/\/www.cjvt.si\/llm4dh\/en\/#website","url":"https:\/\/www.cjvt.si\/llm4dh\/en\/","name":"LLM4DH","description":"Work 
site","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.cjvt.si\/llm4dh\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/www.cjvt.si\/llm4dh\/en\/wp-json\/wp\/v2\/pages\/1163","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.cjvt.si\/llm4dh\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.cjvt.si\/llm4dh\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.cjvt.si\/llm4dh\/en\/wp-json\/wp\/v2\/users\/19"}],"replies":[{"embeddable":true,"href":"https:\/\/www.cjvt.si\/llm4dh\/en\/wp-json\/wp\/v2\/comments?post=1163"}],"version-history":[{"count":7,"href":"https:\/\/www.cjvt.si\/llm4dh\/en\/wp-json\/wp\/v2\/pages\/1163\/revisions"}],"predecessor-version":[{"id":1588,"href":"https:\/\/www.cjvt.si\/llm4dh\/en\/wp-json\/wp\/v2\/pages\/1163\/revisions\/1588"}],"up":[{"embeddable":true,"href":"https:\/\/www.cjvt.si\/llm4dh\/en\/wp-json\/wp\/v2\/pages\/953"}],"wp:attachment":[{"href":"https:\/\/www.cjvt.si\/llm4dh\/en\/wp-json\/wp\/v2\/media?parent=1163"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}