SE AI

Sep 10, 2024

Generative AI for Requirements Engineering: A Systematic Literature Review

Haowei Cheng, Jati H. Husen, Yijun Lu, Teeradaj Racharak, Nobukazu Yoshioka, Naoyasu Ubayashi, Hironori Washizaki

arXiv:2409.06741v3

13.240 citations

h-index: 27

Originality: Synthesis-oriented

AI Analysis

This review identifies systemic bottlenecks in applying generative AI to requirements engineering, showing that industrial adoption remains nascent with only 1.3% of studies reaching production-level integration.

This systematic literature review analyzed 238 articles on generative AI for requirements engineering. It finds that GPT models dominate current applications (67.3%) but that research is unevenly distributed across RE phases. Reproducibility (66.8%), hallucinations (63.4%), and interpretability (57.1%) form tightly interlinked core challenges that must be addressed holistically.

Introduction: Requirements engineering (RE) faces challenges in handling increasingly complex software systems. These challenges can be addressed using generative AI (GenAI). Because GenAI-based RE has not been systematically analyzed in detail, this review examines related research, focusing on trends, methodologies, challenges, and future directions.

Methods: A systematic methodology for paper selection, data extraction, and feature analysis is used to comprehensively review 238 articles published from 2019 to 2025 and available from major academic databases.

Results: Generative pretrained transformer models dominate current applications (67.3%), but research remains unevenly distributed across RE phases: analysis (30.0%) and elicitation (22.1%) receive the most attention, while management (6.8%) is underexplored. Three core challenges, reproducibility (66.8%), hallucinations (63.4%), and interpretability (57.1%), form a tightly interlinked triad affecting trust and consistency. Strong correlations (35% co-occurrence) indicate that these challenges must be addressed holistically. Industrial adoption remains nascent, with over 90% of studies at an early stage of development and only 1.3% reaching production-level integration.

Conclusions: Evaluation practices show maturity gaps, limited tool and dataset availability, and fragmented benchmarking approaches. Despite the transformative potential of GenAI-based RE, several barriers hinder practical adoption. The strong correlations among core challenges demand specialized architectures that target interdependencies rather than isolated solutions. The limited deployment reflects systemic bottlenecks in generalizability, data quality, and scalable evaluation methods. Successful adoption requires coordinated development across technical robustness, methodological maturity, and governance integration.
