SEAI Oct 10, 2025

Vector Graph-Based Repository Understanding for Issue-Driven File Retrieval

arXiv:2510.08876v1
h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses the difficulty developers face in navigating complex software projects and automating development within them. It appears incremental, since it builds on existing graph and retrieval techniques.

The paper tackles the problem of understanding large software repositories by converting them into a vectorized knowledge graph that captures architectural and semantic structure. The resulting system supports partial automation of repository development through hybrid retrieval and LLM-based assistance.

We present a repository decomposition system that converts large software repositories into a vectorized knowledge graph. The graph mirrors the project's architectural and semantic structure, captures semantic relationships, and allows a significant degree of automation in further repository development. It encodes syntactic relations such as containment, implementation, references, calls, and inheritance, and augments nodes with LLM-derived summaries and vector embeddings. A hybrid retrieval pipeline combines semantic retrieval with graph-aware expansion, and an LLM-based assistant formulates constrained, read-only graph requests and produces human-oriented explanations.
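To make the hybrid retrieval idea concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a toy in-memory graph, hand-written three-dimensional embeddings in place of model-generated ones, and a simple "top-k by cosine similarity, then expand along edges" strategy. The names `Node`, `RepoGraph`, `hybrid_retrieve`, the file paths, and the parameters `top_k` and `hops` are all hypothetical; only the edge kinds come from the abstract.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field

# Edge kinds listed in the abstract; everything else in this sketch is assumed.
EDGE_KINDS = {"contains", "implements", "references", "calls", "inherits"}


@dataclass
class Node:
    node_id: str
    summary: str            # stands in for an LLM-derived summary
    embedding: list[float]  # toy vector; a real system would use a model


@dataclass
class RepoGraph:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, src: str, kind: str, dst: str) -> None:
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        self.edges.append((src, kind, dst))

    def neighbors(self, node_id: str) -> set[str]:
        # Treat edges as undirected for expansion purposes (an assumption).
        out = set()
        for src, _kind, dst in self.edges:
            if src == node_id:
                out.add(dst)
            elif dst == node_id:
                out.add(src)
        return out


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_retrieve(graph: RepoGraph, query_vec: list[float],
                    top_k: int = 1, hops: int = 1) -> list[str]:
    # Step 1: semantic retrieval, ranking all nodes by cosine similarity.
    ranked = sorted(graph.nodes.values(),
                    key=lambda n: cosine(query_vec, n.embedding),
                    reverse=True)
    selected = [n.node_id for n in ranked[:top_k]]
    # Step 2: graph-aware expansion, adding neighbors hop by hop.
    frontier = set(selected)
    for _ in range(hops):
        reached = set()
        for node_id in frontier:
            reached |= graph.neighbors(node_id)
        reached -= set(selected)
        selected.extend(sorted(reached))
        frontier = reached
    return selected


graph = RepoGraph()
graph.add_node(Node("auth/login.py", "Handles user login", [1.0, 0.0, 0.0]))
graph.add_node(Node("auth/session.py", "Manages sessions", [0.2, 0.9, 0.0]))
graph.add_node(Node("ui/button.py", "Renders a button", [0.0, 0.0, 1.0]))
graph.add_edge("auth/login.py", "calls", "auth/session.py")

result = hybrid_retrieve(graph, [0.9, 0.1, 0.0])
print(result)
```

In this toy setup the query vector is closest to the login file, and the expansion step then pulls in the session file because the two are linked by a `calls` edge, even though its embedding alone would rank it low. The function only reads from the graph, which loosely parallels the read-only constraint described above.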

