SE · Sep 21, 2024
N-Version Assessment and Enhancement of Generative AI
Marcus Kessel, Colin Atkinson
Generative AI (GAI) holds great potential to improve software engineering productivity, but its untrustworthy outputs, particularly in code synthesis, pose significant challenges. The need for extensive verification and validation (V&V) of GAI-generated artifacts may undermine the potential productivity gains. This paper proposes a way of mitigating these risks by exploiting GAI's ability to generate multiple versions of code and tests to facilitate comparative analysis across versions. Rather than relying on the quality of a single test or code module, this "differential GAI" (D-GAI) approach promotes more reliable quality evaluation through version diversity. We introduce the Large-Scale Software Observatorium (LASSO), a platform that supports D-GAI by executing and analyzing large sets of code versions and tests. We discuss how LASSO enables rigorous evaluation of GAI-generated artifacts and propose its application in both software development and GAI research.
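The comparative idea behind D-GAI can be illustrated with a minimal sketch. It is not LASSO's API and not the authors' implementation: the function `differential_report`, the majority-vote oracle, and the three toy "generated" versions are all assumptions made for illustration. Each version is run on the same inputs, and a version is scored by how often its output matches the most common output across versions.

```python
from collections import Counter
from typing import Callable, Dict, Sequence, Tuple


def differential_report(
    versions: Dict[str, Callable],
    inputs: Sequence[Tuple],
) -> Dict[str, float]:
    """Return, per version, the fraction of inputs on which it agrees
    with the majority output (an assumed stand-in for a test oracle)."""
    agreement = {name: 0 for name in versions}
    for args in inputs:
        outputs = {}
        for name, fn in versions.items():
            try:
                outputs[name] = repr(fn(*args))
            except Exception as exc:  # a crash is recorded as an observable outcome
                outputs[name] = f"raised {type(exc).__name__}"
        # Most common observed output; on ties, the first one encountered wins.
        majority, _ = Counter(outputs.values()).most_common(1)[0]
        for name, out in outputs.items():
            if out == majority:
                agreement[name] += 1
    return {name: hits / len(inputs) for name, hits in agreement.items()}


# Hypothetical stand-ins for three generated versions of "absolute value".
def abs_builtin(x):
    return abs(x)


def abs_branch(x):
    return x if x >= 0 else -x


def abs_buggy(x):
    return x  # wrong for negative inputs


VERSIONS = {"abs_builtin": abs_builtin, "abs_branch": abs_branch, "abs_buggy": abs_buggy}
INPUTS = [(3,), (-2,), (0,), (-7,)]

if __name__ == "__main__":
    print(differential_report(VERSIONS, INPUTS))
```

A majority vote is only one possible way to compare versions, and it fails when most versions share the same defect; the sketch is meant to show why version diversity gives more evidence than a single module checked by a single test.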
SE · Mar 22, 2013 · Code
Lowering the Barrier to Reuse through Test-Driven Search
Werner Janjic, Dietmar Stoll, Philipp Bostan et al.
Dedicated software search engines that index open source software repositories or in-house software assets significantly improve the chances of finding software components suitable for reuse. However, they still leave the work of evaluating and testing components to the developer. To significantly change the risk-cost-benefit tradeoff involved in software reuse, search engines need to be supported by user-friendly environments that deliver code search functionality non-intrusively, right to developers' fingertips.
SE · Jun 7, 2024
Morescient GAI for Software Engineering (Extended Version)
Marcus Kessel, Colin Atkinson
The ability of Generative AI (GAI) technology to automatically check, synthesize and modify software engineering artifacts promises to revolutionize all aspects of software engineering. Using GAI for software engineering tasks is consequently one of the most rapidly expanding fields of software engineering research, with over a hundred LLM-based code models having been published since 2021. However, the overwhelming majority of existing code models share a major weakness: they are exclusively trained on the syntactic facet of software, significantly lowering their trustworthiness in tasks dependent on software semantics. To address this problem, a new class of "Morescient" GAI is needed that is "aware" of (i.e., trained on) both the semantic and static facets of software. This, in turn, will require a new generation of software observation platforms capable of generating large quantities of execution observations in a structured and readily analyzable way. In this paper, we present a vision and roadmap for how such "Morescient" GAI models can be engineered, evolved and disseminated according to the principles of open science.