CY · CL · May 29

Traceable by Design: An LLM Pipeline and Dashboard for EU Regulatory Consultation Analysis

arXiv:2605.30995 · 94.3 · h-index: 5 · Has Code
AI Analysis

This work provides a tool for policymakers and researchers to efficiently analyze public consultation data, addressing the challenge of manual analysis for large datasets.

This paper introduces an LLM-based pipeline and interactive dashboard designed to analyze large volumes of stakeholder submissions from public regulatory consultations. Applied to 4,322 submissions for the EU's Digital Fairness Act, the system extracted 15,368 topic annotations supported by 20,951 verbatim evidence quotes.

Public consultations generate large volumes of stakeholder submissions that are impractical to analyse manually. We present an end-to-end LLM-based pipeline and interactive dashboard for structured topic extraction from regulatory consultation submissions, demonstrated on the European Commission's Digital Fairness Act (DFA) public call for evidence as a case study. The system processes raw PDF attachments and web-form responses, extracts topic annotations, and grounds every extraction in a verbatim quote from the source text. Applied to 4,322 DFA submissions, the pipeline produced 15,368 topic annotations supported by 20,951 verbatim evidence quotes. Three principles govern the proposed design: verbatim grounding, full traceability, and transparency by design. The dashboard exposes the full extraction dataset through five analytical views, from dataset-level topic overviews to individual paragraph drill-downs, with every result traceable to its source. Beyond the predefined DFA topic categories, the pipeline surfaced stakeholder concerns, such as Age Verification, Payment Processor Censorship, and Digital Ownership, that a fixed-taxonomy approach would have missed. The pipeline is domain-generic; adapting it to a new consultation requires only a prompt update and a new dataset. A live demo is available at https://dfa-dashboard.thalesbertaglia.com/. The code and processed data are publicly available at https://github.com/thalesbertaglia/dfa-dashboard.
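The verbatim-grounding principle described above can be illustrated with a short check that keeps a topic annotation only when at least one of its quotes appears word for word in the source submission. This is a minimal sketch under stated assumptions, not the authors' implementation: the `Annotation` class, the `normalize` helper, and the whitespace-insensitive matching rule are all hypothetical choices made for illustration.

```python
import re
from dataclasses import dataclass


def normalize(text: str) -> str:
    """Collapse whitespace runs so PDF line breaks do not block a match."""
    return re.sub(r"\s+", " ", text).strip()


@dataclass
class Annotation:
    # Hypothetical record shape: one topic label plus its evidence quotes.
    topic: str
    quotes: list[str]


def grounded_quotes(annotation: Annotation, source_text: str) -> list[str]:
    """Return only the quotes that occur verbatim in the source text."""
    source = normalize(source_text)
    return [
        q for q in annotation.quotes
        if normalize(q) and normalize(q) in source
    ]


def keep_grounded(annotations: list[Annotation], source_text: str) -> list[Annotation]:
    """Drop annotations that have no verbatim supporting quote."""
    kept = []
    for ann in annotations:
        quotes = grounded_quotes(ann, source_text)
        if quotes:
            kept.append(Annotation(ann.topic, quotes))
    return kept


# Illustrative data only; not taken from the DFA dataset.
submission = "Age verification\nshould respect privacy. Loot boxes harm minors."
candidates = [
    Annotation("Age Verification", ["Age verification should respect privacy."]),
    Annotation("Dark Patterns", ["Dark patterns are everywhere."]),
]
grounded = keep_grounded(candidates, submission)
```

In this sketch the second candidate is discarded because its quote never appears in the submission, which is the behaviour a grounding filter is meant to enforce. Exact substring matching is strict by design; a real pipeline might need to tolerate OCR noise or hyphenation, which this example does not attempt.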

