SE · CL · Mar 23

Generating and Evaluating Sustainable Procurement Criteria for the Swiss Public Sector using In-Context Prompting with Large Language Models

Yingqiang Gao, Veton Matoshi, Luca Rolshoven, Tilia Ellendorff, Judith Binder, Jeremy Austin Jann, Gerold Schneider, Matthias Stürmer

arXiv:2603.22513 · 41.8 · h-index: 8

Predicted impact: top 58% in SE · last 90 days · Originality: Synthesis-oriented

AI Analysis

This work addresses a domain-specific problem for public procurement officials in Switzerland by automating a manual, error-prone process. It is incremental, however, in that it applies existing in-context prompting methods to a new application area.

The paper tackles the labor-intensive task of translating high-level sustainability regulations into concrete procurement criteria for the Swiss public sector. According to the authors, the resulting LLM-assisted pipeline can substantially reduce manual drafting effort while producing criteria consistent with official guidelines.

Public procurement refers to the process by which public sector institutions, such as governments, municipalities, and publicly funded bodies, acquire goods and services. Swiss law requires that ecological, social, and economic sustainability requirements be integrated into tender evaluations in the form of criteria that bidders must fulfill. However, translating high-level sustainability regulations into concrete, verifiable, and sector-specific procurement criteria (such as selection criteria, award criteria, and technical specifications) remains a labor-intensive and error-prone manual task that requires substantial domain expertise across several groups of goods and services. This paper presents a configurable, LLM-assisted software pipeline that supports the systematic generation and evaluation of sustainability-oriented procurement criteria catalogs for Switzerland. The system integrates in-context prompting, interchangeable LLM backends, and automated output validation to enable auditable criteria generation across different procurement sectors. As a proof of concept, we instantiate the pipeline using official sustainability guidelines published by the Swiss government and the European Commission, which are ingested as structured reference documents. We evaluate the system through a combination of automated quality checks, including an LLM-based evaluation component, and expert comparison against a manually curated gold standard. Our results demonstrate that the proposed pipeline can substantially reduce manual drafting effort while producing criteria catalogs that are consistent with official guidelines. We further discuss system limitations, failure modes, and design trade-offs observed during deployment, highlighting key considerations for integrating generative AI into public sector software workflows.
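
As a rough illustration of how in-context prompting, an interchangeable backend, and automated output validation might fit together, here is a minimal Python sketch. It is not the authors' implementation. The function names, the JSON output format, the `Criterion` fields, and the stub backend are all assumptions made for illustration. Only the three criterion types are taken from the abstract.

```python
import json
from dataclasses import dataclass
from typing import Callable, List

# Assumption: a backend is any function mapping a prompt to raw completion text,
# so different LLM providers can be swapped in without touching the pipeline.
Backend = Callable[[str], str]

# Criterion types named in the abstract; the identifiers themselves are hypothetical.
ALLOWED_TYPES = {"selection", "award", "technical_specification"}


@dataclass
class Criterion:
    type: str    # one of ALLOWED_TYPES
    text: str    # the drafted criterion
    source: str  # reference to the guideline excerpt it is based on


def build_prompt(sector: str, guideline_excerpts: List[str]) -> str:
    """In-context prompt: guideline excerpts are placed ahead of the task."""
    context = "\n".join(f"[{i}] {g}" for i, g in enumerate(guideline_excerpts, 1))
    return (
        "Reference guidelines:\n" + context + "\n\n"
        f"Task: draft sustainability procurement criteria for the sector '{sector}'.\n"
        'Answer with a JSON list of objects with keys "type", "text", "source".'
    )


def validate(raw: str) -> List[Criterion]:
    """Reject output that is malformed or uses an unknown criterion type."""
    items = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(items, list):
        raise ValueError("expected a JSON list")
    criteria = []
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("each criterion must be a JSON object")
        if item.get("type") not in ALLOWED_TYPES:
            raise ValueError(f"unknown criterion type: {item.get('type')!r}")
        if not item.get("text") or not item.get("source"):
            raise ValueError("criterion needs non-empty text and source")
        criteria.append(Criterion(item["type"], item["text"], item["source"]))
    return criteria


def generate_criteria(sector: str, excerpts: List[str], backend: Backend) -> List[Criterion]:
    """Prompt the chosen backend and return only validated criteria."""
    return validate(backend(build_prompt(sector, excerpts)))


def stub_backend(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a fixed, made-up answer."""
    return json.dumps(
        [{"type": "award", "text": "Example criterion text.", "source": "[1]"}]
    )


if __name__ == "__main__":
    catalog = generate_criteria("textiles", ["Example guideline excerpt."], stub_backend)
    for criterion in catalog:
        print(criterion)
```

Keeping the backend as a plain callable is one simple way to make it interchangeable. Requiring a `source` field on every criterion is one possible way to support the auditability the abstract mentions.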
