Rel-HNN: Split Parallel Hypergraph Neural Network for Learning on Relational Databases
This work addresses the challenge of applying deep learning to relational data in enterprise applications, offering a novel method with incremental improvements over existing graph-based approaches.
The paper tackles the problem of learning from relational databases by proposing rel-HNN, a hypergraph-based framework that models attribute-value pairs as nodes and tuples as hyperedges to capture fine-grained intra-tuple relationships. It reports significant performance improvements on classification and regression tasks, along with training speedups of up to 3.18x from split-parallel multi-GPU execution.
Relational databases (RDBs) are ubiquitous in enterprise and real-world applications. Flattening the database poses challenges for deep learning models, which rely on fixed-size input representations and struggle to capture the relational semantics of structured data. Graph neural networks (GNNs) have been proposed to address this, but they often oversimplify relational structures by modeling each tuple as a monolithic node and ignoring intra-tuple associations. In this work, we propose a novel hypergraph-based framework, which we call rel-HNN, that models each unique attribute-value pair as a node and each tuple as a hyperedge, enabling the capture of fine-grained intra-tuple relationships. Our approach learns explicit multi-level representations across the attribute-value, tuple, and table levels. To address the scalability challenges posed by large RDBs, we further introduce a split-parallel training algorithm that leverages multi-GPU execution for efficient hypergraph learning. Extensive experiments on real-world and benchmark datasets demonstrate that rel-HNN significantly outperforms existing methods in both classification and regression tasks. Moreover, our split-parallel training achieves substantial speedups over conventional single-GPU execution: up to 3.18x for learning on relational data and up to 2.94x for hypergraph learning.
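To make the modeling idea concrete, the following is a minimal sketch of how a small relational database could be turned into a hypergraph in which each unique attribute-value pair is a node and each tuple is a hyperedge. It is an illustration under stated assumptions, not the rel-HNN implementation: the function name `build_hypergraph`, the toy tables, and the choice to treat identically named attributes in different tables as the same attribute (so that a shared key value links tuples across tables) are all hypothetical.

```python
def build_hypergraph(tables):
    """Build a hypergraph from a dict mapping table name -> list of row dicts.

    Assumption (not from the paper): attributes with the same name in
    different tables refer to the same attribute, so a shared key value
    maps to a single node.
    """
    node_ids = {}     # (attribute, value) -> integer node id
    hyperedges = []   # one list of node ids per tuple
    edge_table = []   # table name of each hyperedge (table-level grouping)

    for table_name, rows in tables.items():
        for row in rows:
            edge = []
            for attr, value in row.items():
                if value is None:  # skip missing values
                    continue
                key = (attr, value)
                if key not in node_ids:
                    node_ids[key] = len(node_ids)
                edge.append(node_ids[key])
            hyperedges.append(edge)
            edge_table.append(table_name)
    return node_ids, hyperedges, edge_table


# Hypothetical toy database with two tables linked by customer_id.
tables = {
    "customers": [
        {"customer_id": 1, "city": "Oslo"},
        {"customer_id": 2, "city": "Oslo"},
    ],
    "orders": [
        {"order_id": 10, "customer_id": 1, "status": "shipped"},
    ],
}

node_ids, hyperedges, edge_table = build_hypergraph(tables)
print(len(node_ids))   # number of unique attribute-value nodes
print(hyperedges)      # node ids grouped per tuple
```

In this toy example the two customer tuples share the node for `("city", "Oslo")`, and the order tuple shares the node for `("customer_id", 1)` with the first customer. This is the kind of intra-tuple and cross-tuple association that is lost when each tuple is collapsed into a single graph node.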