LG · Feb 25

NGDB-Zoo: Towards Efficient and Scalable Neural Graph Databases Training

Zhongwei Xie, Jiaxin Bai, Shujie Liu, Haoyu Huang, Yufei Li, Yisen Gao, Hong Ting Tsang, Yangqiu Song

arXiv:2602.21597v12.71 citations · h-index: 5

Originality: Highly original

AI Analysis

This work addresses scalability and expressivity bottlenecks faced by researchers and practitioners who use NGDBs for complex reasoning tasks. It presents a novel method rather than an incremental improvement.

The paper tackles inefficient training and limited expressivity in Neural Graph Databases (NGDBs) by introducing NGDB-Zoo, a framework that combines operator-level training with semantic augmentation. It reports 1.8x to 6.8x throughput improvements over baselines and improved reasoning on large-scale graphs.

Neural Graph Databases (NGDBs) facilitate complex logical reasoning over incomplete knowledge structures, yet their training efficiency and expressivity are constrained by rigid query-level batching and structure-exclusive embeddings. We present NGDB-Zoo, a unified framework that resolves these bottlenecks by synergizing operator-level training with semantic augmentation. By decoupling logical operators from query topologies, NGDB-Zoo transforms the training loop into a dynamically scheduled data-flow execution, enabling multi-stream parallelism and achieving a $1.8\times$ to $6.8\times$ throughput improvement over baselines. Furthermore, we formalize a decoupled architecture to integrate high-dimensional semantic priors from Pre-trained Text Encoders (PTEs) without triggering I/O stalls or memory overflows. Extensive evaluations on six benchmarks, including massive graphs like ogbl-wikikg2 and ATLAS-Wiki, demonstrate that NGDB-Zoo maintains high GPU utilization across diverse logical patterns and significantly mitigates representation friction in hybrid neuro-symbolic reasoning.
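
The contrast between query-level and operator-level batching can be pictured with a small toy scheduler. This is only a sketch under stated assumptions, not NGDB-Zoo's implementation: the operator names (`anchor`, `project`, `intersect`), the scalar stand-ins for embeddings, and the function `run_operator_level` are all hypothetical. It shows how operators from queries with different topologies can be grouped by type and executed together once their inputs are ready.

```python
from collections import defaultdict

# Hypothetical toy operators on batches of scalar "embeddings" (assumption:
# real systems would use learned neural operators on tensors instead).
OPS = {
    "anchor": lambda batch: [float(args[0]) for args in batch],
    "project": lambda batch: [x + 1.0 for (x,) in batch],
    "intersect": lambda batch: [min(xs) for xs in batch],
}

def run_operator_level(queries):
    """Run all queries by batching ready operators of the same type.

    queries: list of dicts mapping node name -> (op, inputs). For "anchor",
    inputs holds a constant; for other ops, inputs are node names.
    """
    results = {}       # (query index, node name) -> computed value
    batch_sizes = []   # (operator, batch size) for each batched launch
    pending = {(q, n) for q, nodes in enumerate(queries) for n in nodes}
    while pending:
        # Group every node whose inputs are available by operator type,
        # regardless of which query (topology) it belongs to.
        ready = defaultdict(list)
        for q, n in sorted(pending):
            op, inputs = queries[q][n]
            if op == "anchor" or all((q, i) in results for i in inputs):
                ready[op].append((q, n))
        if not ready:
            raise ValueError("cyclic query or missing input node")
        for op, keys in ready.items():
            batch = []
            for q, n in keys:
                inputs = queries[q][n][1]
                if op == "anchor":
                    batch.append(inputs)
                else:
                    batch.append(tuple(results[(q, i)] for i in inputs))
            # One batched call per operator type per scheduling round.
            for key, out in zip(keys, OPS[op](batch)):
                results[key] = out
            pending.difference_update(keys)
            batch_sizes.append((op, len(keys)))
    return results, batch_sizes

# Two queries with different shapes: a 2-hop path and a 2-way intersection.
queries = [
    {"a": ("anchor", (1,)), "p1": ("project", ("a",)), "p2": ("project", ("p1",))},
    {"a": ("anchor", (3,)), "b": ("anchor", (5,)),
     "pa": ("project", ("a",)), "pb": ("project", ("b",)),
     "i": ("intersect", ("pa", "pb"))},
]
results, batch_sizes = run_operator_level(queries)
```

In this toy run the three anchors and the three first-hop projections from both queries each share a single batched call, even though the queries have different shapes. Under query-level batching they could only be grouped with queries of the same topology.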
