The importance of visual modelling languages in generative software engineering
This addresses the problem of integrating visual and textual information in generative AI for software engineers, but it is incremental as it applies an existing method to a new application area.
The paper investigates the use of multimodal GPT-4, which accepts image and text inputs, for Software Engineering tasks involving diagrams and natural language, finding that it enables novel use cases in this domain.
Multimodal GPTs represent a watershed in the interplay between Software Engineering and Generative Artificial Intelligence. GPT-4 accepts image and text inputs, rather than simply natural language. We investigate relevant use cases stemming from these enhanced capabilities of GPT-4. To the best of our knowledge, no other work has investigated similar use cases involving Software Engineering tasks carried out via multimodal GPTs prompted with a mix of diagrams and natural language.