SE AI Feb 19, 2025

Explore-Construct-Filter: An Automated Framework for Rich and Reliable API Knowledge Graph Construction

Yanbang Sun, Qing Huang, Xiaoxue Ren, Zhenchang Xing, Xiaohong Li, Junjie Wang

arXiv:2502.13412v1 3.4 h-index: 12

Originality: Highly original

AI Analysis

This work addresses the challenge of building rich and reliable API knowledge graphs (KGs) for tasks like API recommendation and code generation, offering an automated solution that reduces manual effort and noise compared to existing methods.

The paper tackles the problem of constructing API Knowledge Graphs (API KGs) by proposing the Explore-Construct-Filter framework, which uses large language models to automate schema design and filtering, resulting in a 25.2% improvement in F1 score over state-of-the-art methods.

The API Knowledge Graph (API KG) is a structured network that models API entities and their relations, providing essential semantic insights for tasks such as API recommendation, code generation, and API misuse detection. However, constructing a knowledge-rich and reliable API KG presents several challenges. Existing schema-based methods rely heavily on manual annotation to design KG schemas, which incurs excessive manual overhead. Schema-free methods, on the other hand, lack schema guidance and are therefore prone to introducing noise, which reduces the KG's reliability. To address these issues, we propose the Explore-Construct-Filter framework, an automated approach for API KG construction based on large language models (LLMs). The framework consists of three key modules: 1) KG exploration: LLMs simulate the workflow of annotators to automatically design a schema with comprehensive type triples, minimizing human intervention; 2) KG construction: guided by the schema, LLMs extract instance triples to construct an API KG that is rich but not yet reliable; 3) KG filtering: invalid type triples and suspicious instance triples are removed to produce a rich and reliable API KG. Experimental results demonstrate that our method surpasses the state-of-the-art method, achieving a 25.2% improvement in F1 score. Moreover, each module proves effective: the KG exploration module increases KG richness by 133.6%, and the KG filtering module improves reliability by 26.6%. Finally, cross-model experiments confirm the generalizability of our framework.
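
The three modules described above can be pictured as a small pipeline. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `explore`, `construct`, `filter_kg`, `InstanceTriple`, `propose_types`, `extract`, and `min_support` are all hypothetical, the LLM calls are replaced by plain callables supplied by the caller, and the filtering step uses a simple support threshold as a stand-in because the abstract does not say how invalid or suspicious triples are detected.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable

# A type triple is (head entity type, relation type, tail entity type).
TypeTriple = tuple[str, str, str]


@dataclass(frozen=True)
class InstanceTriple:
    """One extracted fact, tagged with the type triple it instantiates."""
    head: str
    relation: str
    tail: str
    type_triple: TypeTriple


def explore(seed_texts: Iterable[str],
            propose_types: Callable[[str], Iterable[TypeTriple]]) -> set[TypeTriple]:
    """KG exploration: collect candidate type triples into a schema.

    `propose_types` stands in for an LLM prompt (an assumption).
    """
    schema: set[TypeTriple] = set()
    for text in seed_texts:
        schema.update(propose_types(text))
    return schema


def construct(texts: Iterable[str],
              schema: set[TypeTriple],
              extract: Callable[[str, set[TypeTriple]], Iterable[InstanceTriple]],
              ) -> list[InstanceTriple]:
    """KG construction: keep only instance triples that fit the schema.

    `extract` stands in for a schema-guided LLM prompt (an assumption).
    """
    triples: list[InstanceTriple] = []
    for text in texts:
        for triple in extract(text, schema):
            if triple.type_triple in schema:
                triples.append(triple)
    return triples


def filter_kg(schema: set[TypeTriple],
              triples: list[InstanceTriple],
              min_support: int = 2,
              ) -> tuple[set[TypeTriple], list[InstanceTriple]]:
    """KG filtering: drop weakly supported type triples and their instances.

    The support threshold is an illustrative stand-in, not the paper's rule.
    """
    support = Counter(t.type_triple for t in triples)
    kept_schema = {tt for tt in schema if support[tt] >= min_support}
    kept_triples = [t for t in triples if t.type_triple in kept_schema]
    return kept_schema, kept_triples
```

In this sketch the schema plays two roles: it constrains what the construction step may emit, and it gives the filtering step a unit (the type triple) over which to measure support. Swapping in real LLM prompts only requires replacing the two callables.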
