SEAI May 21, 2024

PathOCL: Path-Based Prompt Augmentation for OCL Generation with GPT-4

arXiv:2405.12450v2
14 citations
h-index: 48
2024 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering (Forge)
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in AI-assisted software development for developers who work with UML models, but it is incremental because it builds on existing chunking methods for prompt augmentation.

The authors address the challenge of generating Object Constraint Language (OCL) constraints from UML models with GPT-4, whose prompts are bounded by a token limit. They introduce PathOCL, a path-based prompt augmentation technique that includes only the UML classes relevant to the English specification. Compared with augmenting the prompt with the complete UML model, PathOCL generated more valid and correct OCL constraints and significantly reduced the average prompt size.

The rapid progress of AI-powered programming assistants, such as GitHub Copilot, has facilitated the development of software applications. These assistants rely on large language models (LLMs), which are foundation models (FMs) that support a wide range of tasks related to understanding and generating language. LLMs have demonstrated their ability to express UML model specifications using formal languages like the Object Constraint Language (OCL). However, the context size of the prompt is limited by the number of tokens an LLM can process. This limitation becomes significant as the size of UML class models increases. In this study, we introduce PathOCL, a novel path-based prompt augmentation technique designed to facilitate OCL generation. PathOCL addresses the limitations of LLMs, specifically their token processing limit and the challenges posed by large UML class models. PathOCL is based on the concept of chunking, which selectively augments the prompts with a subset of UML classes relevant to the English specification. Our findings demonstrate that PathOCL, compared to augmenting the complete UML class model (UML-Augmentation), generates a higher number of valid and correct OCL constraints using the GPT-4 model. Moreover, the average prompt size crafted using PathOCL significantly decreases when scaling the size of the UML class models.
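
The abstract does not spell out how the relevant classes are chosen, so the following Python sketch is only one plausible reading of "path-based" chunking, not the authors' implementation. It assumes a hypothetical toy class model (`UML_MODEL`), a naive substring match to find classes named in the specification, and shortest association paths between those classes. All function names are illustrative, and no GPT-4 call is made.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy model: class name -> (attributes, associated class names).
UML_MODEL = {
    "Library": (["name"], ["Member", "Staff"]),
    "Member": (["name", "age"], ["Library", "Loan"]),
    "Loan": (["dueDate"], ["Member", "Book"]),
    "Book": (["title"], ["Loan", "Publisher"]),
    "Publisher": (["name"], ["Book"]),
    "Staff": (["salary"], ["Library"]),
}


def mentioned_classes(spec, model):
    """Return classes whose name occurs in the spec (naive substring match)."""
    text = spec.lower()
    return [name for name in model if name.lower() in text]


def shortest_path(model, start, goal):
    """Breadth-first search over associations; returns class names or []."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbour in model[current][1]:
            if neighbour not in previous:
                previous[neighbour] = current
                queue.append(neighbour)
    return []


def select_classes(spec, model):
    """Keep mentioned classes plus every class on a path joining two of them."""
    seeds = mentioned_classes(spec, model)
    selected = set(seeds)
    for a, b in combinations(seeds, 2):
        selected.update(shortest_path(model, a, b))
    return [name for name in model if name in selected]  # keep model order


def render(model, names):
    """Serialise the chosen classes as plain text for the prompt."""
    lines = []
    for name in names:
        attrs, assocs = model[name]
        kept = [a for a in assocs if a in names]  # drop links to omitted classes
        lines.append(
            f"class {name} attributes: {', '.join(attrs)} "
            f"associations: {', '.join(kept)}"
        )
    return "\n".join(lines)


def build_prompt(spec, model, names):
    """Combine the serialised classes with the English specification."""
    classes = render(model, names)
    return f"UML classes:\n{classes}\n\nWrite an OCL constraint for: {spec}"


spec = "Every loan in a library must have a due date."
subset = select_classes(spec, UML_MODEL)
path_prompt = build_prompt(spec, UML_MODEL, subset)
full_prompt = build_prompt(spec, UML_MODEL, list(UML_MODEL))
print(subset)
print(len(path_prompt), "characters vs", len(full_prompt), "for the full model")
```

For this specification the sketch keeps `Library`, `Member`, and `Loan`: the two classes named in the text plus the class on the path that links them. The other three classes are left out of the prompt, which is the size reduction the abstract describes relative to augmenting with the complete model.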

