A Locality Radius Framework for Understanding Relational Inductive Bias in Database Learning
This work addresses schema-level prediction problems relevant to database practitioners. Its contribution is incremental: it formalizes and tests an existing hypothesis about relational inductive bias.
The paper asks when multi-hop structural reasoning is actually necessary in database learning tasks, and introduces locality radius as a formal measure of the structural neighborhood a prediction requires. Results show a consistent bias-radius alignment effect across tasks such as foreign key prediction and join cost estimation.
Foreign key discovery and related schema-level prediction tasks are often modeled using graph neural networks (GNNs), implicitly assuming that relational inductive bias improves performance. However, it remains unclear when multi-hop structural reasoning is actually necessary. In this work, we introduce locality radius, a formal measure of the minimum structural neighborhood required to determine a prediction in relational schemas. We hypothesize that model performance depends critically on alignment between task locality radius and architectural aggregation depth. We conduct a controlled empirical study across foreign key prediction, join cost estimation, blast radius regression, cascade impact classification, and additional graph-derived schema tasks. Our evaluation includes multi-seed experiments, capacity-matched comparisons, statistical significance testing, scaling analysis, and synthetic radius-controlled benchmarks. Results reveal a consistent bias-radius alignment effect.
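To make the idea of a structural neighborhood concrete, the following is a minimal sketch and not the paper's formal definition. It assumes a schema can be viewed as an undirected graph whose nodes are tables and whose edges are foreign-key links. The toy schema and the helper names `k_hop_neighborhood` and `min_radius_to_reach` are hypothetical, introduced only for illustration. The sketch computes the set of tables within k hops of a source table and the smallest k whose neighborhood contains a given set of tables. Under the alignment hypothesis, a model that aggregates over fewer hops than that k would not see all the structure the prediction depends on.

```python
from collections import deque


def k_hop_neighborhood(adj, source, k):
    """Return the set of nodes within k hops of `source` (breadth-first search)."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # do not expand beyond the radius
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return seen


def min_radius_to_reach(adj, source, required):
    """Smallest k whose k-hop neighborhood of `source` contains every node in
    `required`, or None if some required node is unreachable."""
    required = set(required)
    for k in range(len(adj) + 1):
        if required <= k_hop_neighborhood(adj, source, k):
            return k
    return None


# Hypothetical toy schema: tables as nodes, foreign-key links as undirected edges.
schema = {
    "orders": ["customers", "order_items"],
    "customers": ["orders"],
    "order_items": ["orders", "products"],
    "products": ["order_items", "suppliers"],
    "suppliers": ["products"],
}

one_hop = k_hop_neighborhood(schema, "orders", 1)
radius_to_suppliers = min_radius_to_reach(schema, "orders", {"suppliers"})
```

In this toy schema, `suppliers` is three hops from `orders`, so a prediction about `orders` that depends on `suppliers` would need at least three rounds of neighborhood aggregation in this simplified view.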