Adapting large language models to generate infrastructure descriptions in code

VeMo-IaC

The project addresses the use of large language models to generate infrastructure specifications as computer code (Infrastructure as Code, IaC). It builds on the technical specifications and software for building large language models that will be developed in the SloLLaMai project, and on the infrastructure built in the SloSBZ project for constructing the necessary language resources (command instructions and documentation requirements). The research will pay particular attention to the robustness, safety and efficiency of the language technologies. The resulting solution will enrich and improve a product already on the market (Spotter).

Specific objectives:

  • Studying how language technologies can be used for the automated construction of infrastructure descriptions in code, and selecting the most appropriate language models with respect to the key requirements: reliability, security and low computational complexity.
  • Developing a training set of IaC command prompts and user dialogues for automated generation of infrastructure descriptions (based on the results of the SloSBZ project).
  • Performing a detailed evaluation of the training examples, focusing on the quality and correctness of the code examples and taking into account the software code versions.
  • Developing a service/module for automatic generation of infrastructure descriptions and integrating the module into an existing product.

Results:

  • D6.1 – Developing a first roadmap and approaches for the effective use of large language models in IaC: Use and selection of training data, description of possible approaches (February 2024).
  • D6.2 – First version of the software to enable the use of language technologies in IaC, taking into account the constraints of capacity use, robustness and security (August 2025).
  • D6.3 – Refinement of the roadmap and revisions of approaches for the effective use of large language models in IaC: Use and selection of training data, description of approaches to construction (February 2025).
  • D6.4 – Second version of the software enabling the use of language technologies in IaC and taking into account the constraints of capacity use, robustness and security – TRL5 achieved (August 2025).
  • D6.5 – Final version of software enabling the use of language technologies in IaC and taking into account the constraints of capacity use, robustness and security – TRL6 achieved (June 2026).

Project partners:

Project leader:
Partners: