Oishik Chowdhury

30.7SEApr 19

A Pilot Study on Detecting Software Design Patterns with Large Language Models: An Empirical Evaluation

Oishik Chowdhury, Bastin Tony Roy Savarimuthu, Sherlock A. Licorish

Design patterns provide reusable solutions to recurring software design problems. Automatically detecting these patterns in source code can help bootstrap new developers' understanding of unfamiliar software system architectures, and can help experienced developers to quickly identify and rectify potential quality issues. While many prior research works have explored graph based and machine-learning based detection techniques, this work evaluates the design pattern recognition capabilities of four Large Language Models and two ensemble approaches consisting three out of the four models. We also compare their performance when prompted with a) Source code, b) PlantUML representation of source code, and c) Text-based descriptions of the source code. We investigate the detection of five design patterns: singleton, adapter, bridge, composite and decorator. Our preliminary results indicate that LLMs show promise for automatically detecting design patterns, with NextCoder and Gemma 3 demonstrating comparatively higher accuracy than other models evaluated, and the ensemble approaches enhancing the overall efficiency of design pattern detection. We identify several directions for future work.

MAMar 3

Social Norm Reasoning in Multimodal Language Models: An Evaluation

Oishik Chowdhury, Anushka Debnath, Bastin Tony Roy Savarimuthu

In Multi-Agent Systems (MAS), agents are designed with social capabilities, allowing them to understand and reason about social concepts such as norms when interacting with others (e.g., inter-robot interactions). In Normative MAS (NorMAS), researchers study how norms develop, and how violations are detected and sanctioned. However, existing research in NorMAS use symbolic approaches (e.g., formal logic) for norm representation and reasoning whose application is limited to simplified environments. In contrast, Multimodal Large Language Models (MLLMs) present promising possibilities to develop software used by robots to identify and reason about norms in a wide variety of complex social situations embodied in text and images. However, prior work on norm reasoning have been limited to text-based scenarios. This paper investigates the norm reasoning competence of five MLLMs by evaluating their ability to answer norm-related questions based on thirty text-based and thirty image-based stories, and comparing their responses against humans. Our results show that MLLMs demonstrate superior performance in norm reasoning in text than in images. GPT-4o performs the best in both modalities offering the most promise for integration with MAS, followed by the free model Qwen-2.5VL. Additionally, all models find reasoning about complex norms challenging.

Oishik Chowdhury

2 Papers