NLP-AKG: Few-Shot Construction of NLP Academic Knowledge Graph Based on LLM
This work aims to give more comprehensive and specific answers in question answering over NLP scientific literature by integrating paper entities with domain concepts. The contribution is incremental, since it builds on existing LLM-based methods.
The authors address incomplete external knowledge structures for scientific literature by proposing a knowledge graph framework that captures deep conceptual relations between academic papers. The resulting graph, NLP-AKG, contains 620,353 entities and 2,271,584 relations extracted from 60,826 papers, and the approach is validated on three NLP question answering datasets.
Large language models (LLMs) have been widely applied in question answering over scientific research papers. To enhance the professionalism and accuracy of responses, many studies employ external knowledge augmentation. However, existing structures of external knowledge in scientific literature often focus solely on either paper entities or domain concepts, neglecting the intrinsic connections between papers through shared domain concepts. This results in less comprehensive and specific answers to questions that combine papers and concepts. To address this, we propose a novel knowledge graph framework that captures deep conceptual relations between academic papers, constructing a relational network via intra-paper semantic elements and inter-paper citation relations. Using a few-shot knowledge graph construction method based on LLMs, we develop NLP-AKG, an academic knowledge graph for the NLP domain, by extracting 620,353 entities and 2,271,584 relations from 60,826 papers in the ACL Anthology. Based on this, we propose a 'sub-graph community summary' method and validate its effectiveness on three NLP scientific literature question answering datasets.
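To make the idea of linking papers through shared domain concepts concrete, the following is a minimal sketch of such a graph, not the NLP-AKG schema itself. The class `AcademicKG`, the relation names (`addresses_task`, `uses_method`, `cites`), and the paper and concept identifiers are all hypothetical and chosen only for illustration; the abstract does not specify the actual entity or relation types.

```python
from collections import defaultdict


class AcademicKG:
    """Toy triple store: nodes are papers or concepts, edges are typed relations.

    Illustrative only; the node types and relation names are assumptions.
    """

    def __init__(self):
        self.node_type = {}                 # node id -> "paper" or "concept"
        self.out_edges = defaultdict(set)   # head -> {(relation, tail)}
        self.in_edges = defaultdict(set)    # tail -> {(relation, head)}

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_relation(self, head, relation, tail):
        self.out_edges[head].add((relation, tail))
        self.in_edges[tail].add((relation, head))

    def papers_sharing_concepts(self, paper_id):
        """Map each other paper to the sorted concepts it shares with paper_id."""
        shared = defaultdict(set)
        for _, concept in self.out_edges[paper_id]:
            if self.node_type.get(concept) != "concept":
                continue  # skip paper-to-paper edges such as citations
            for _, other in self.in_edges[concept]:
                if other != paper_id and self.node_type.get(other) == "paper":
                    shared[other].add(concept)
        return {paper: sorted(concepts) for paper, concepts in shared.items()}


kg = AcademicKG()
for paper in ("paper_A", "paper_B", "paper_C"):
    kg.add_node(paper, "paper")
for concept in ("machine translation", "attention"):
    kg.add_node(concept, "concept")

# Intra-paper semantic elements (hypothetical relation names).
kg.add_relation("paper_A", "addresses_task", "machine translation")
kg.add_relation("paper_A", "uses_method", "attention")
kg.add_relation("paper_B", "uses_method", "attention")
kg.add_relation("paper_C", "addresses_task", "machine translation")
# Inter-paper citation relation.
kg.add_relation("paper_B", "cites", "paper_A")

related = kg.papers_sharing_concepts("paper_A")
print(related)
```

In this toy graph, `paper_C` is reachable from `paper_A` only through the shared concept node, with no citation between them. That is the kind of connection a paper-only or concept-only structure would miss.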