SE · Apr 9

AFGNN: API Misuse Detection using Graph Neural Networks and Clustering

Ponnampalam Pirapuraj, Tamal Mondal, Sharanya Gupta, Akash Lal, Somak Aditya, Jyothi Vedurada

arXiv:2604.07891 · 17.1 · h-index: 6

Predicted impact: top 83% in SE · last 90 days · Originality: Incremental advance

AI Analysis

This work addresses API misuse detection to help software developers improve code safety. It appears incremental, as it builds on existing GNN and clustering techniques for a specific domain.

The paper tackles the detection of API misuses in Java code, which are a significant source of bugs and vulnerabilities. It presents AFGNN, a Graph Neural Network-based framework that combines a novel API Flow Graph representation with clustering. In experiments on popular datasets, the authors report that it significantly outperforms state-of-the-art methods.

Application Programming Interfaces (APIs) are crucial to software development, enabling integration of existing systems with new applications by reusing tried and tested code, saving development time and increasing software safety. In particular, the Java standard library APIs, along with numerous third-party APIs, are extensively utilized in the development of enterprise application software. However, their misuse remains a significant source of bugs and vulnerabilities. Furthermore, due to the limited examples in the official API documentation, developers often rely on online portals and generative AI models to learn unfamiliar APIs, but using such examples may introduce unintentional errors in the software. In this paper, we present AFGNN, a novel Graph Neural Network (GNN)-based framework for efficiently detecting API misuses in Java code. AFGNN uses a novel API Flow Graph (AFG) representation that captures the API execution sequence, data, and control flow information present in the code to model the API usage patterns. AFGNN uses self-supervised pre-training with AFG representation to effectively compute the embeddings for unknown API usage examples and cluster them to identify different usage patterns. Experiments on popular API usage datasets show that AFGNN significantly outperforms state-of-the-art small language models and API misuse detectors.
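The idea of grouping API usage examples by their flow structure can be illustrated with a minimal sketch. This is not the authors' method: the edge-set graph, the `build_graph`, `jaccard`, and `cluster` helpers, and the similarity threshold are all hypothetical, and Jaccard similarity over labeled edges stands in for the learned GNN embeddings described in the abstract. Treating a singleton cluster as a candidate misuse is likewise an assumption made only for illustration.

```python
# Toy stand-in for an "API flow graph": a set of labeled edges
# (src, dst, kind) with kind in {"seq", "data", "control"}.
# This is an illustrative assumption, not the paper's AFG definition.
def build_graph(calls, data_edges=(), control_edges=()):
    # Consecutive API calls are linked by sequence edges.
    edges = {(a, b, "seq") for a, b in zip(calls, calls[1:])}
    edges |= {(a, b, "data") for a, b in data_edges}
    edges |= {(a, b, "control") for a, b in control_edges}
    return frozenset(edges)


def jaccard(g1, g2):
    # Edge-set overlap; used here in place of embedding similarity.
    union = g1 | g2
    return len(g1 & g2) / len(union) if union else 1.0


def cluster(graphs, threshold=0.5):
    # Greedy single-pass clustering: an example joins the first cluster
    # whose representative (first member) is similar enough, otherwise
    # it starts a new cluster.
    clusters = []
    for i, g in enumerate(graphs):
        for members in clusters:
            if jaccard(graphs[members[0]], g) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Three usages that guard Iterator.next() with hasNext(), one that does not.
guarded = build_graph(
    ["Iterator.hasNext", "Iterator.next"],
    control_edges=[("Iterator.hasNext", "Iterator.next")],
)
unguarded = build_graph(["List.iterator", "Iterator.next"])

examples = [guarded, guarded, guarded, unguarded]
clusters = cluster(examples)
print(clusters)  # [[0, 1, 2], [3]]
```

In this toy setting the unguarded usage shares no edges with the guarded ones, so it lands in its own cluster. A real system would replace the edge-set comparison with learned graph embeddings and a proper clustering algorithm.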
