LG · Mar 31

Task Scarcity and Label Leakage in Relational Transfer Learning

Francisco Galuppo Azevedo, Clarissa Lima Loures, Denis Oliveira Correa

arXiv:2603.29914

AI Analysis

This paper addresses a bottleneck in training relational foundation models: label leakage caused by limited task diversity. The contribution is incremental but important for improving transfer learning on relational databases.

The paper addresses task scarcity, which causes label leakage in relational transfer learning and degrades transfer performance. It introduces a gradient projection method that improves within-dataset transfer by +0.145 AUROC on average, often recovering near single-task performance.

Training relational foundation models requires learning representations that transfer across tasks, yet available supervision is typically limited to a small number of prediction targets per database. This task scarcity causes learned representations to encode task-specific shortcuts that degrade transfer even within the same schema, a problem we call label leakage. We study this using K-Space, a modular architecture combining frozen pretrained tabular encoders with a lightweight message-passing core. To suppress leakage, we introduce a gradient projection method that removes label-predictive directions from representation updates. On RelBench, this improves within-dataset transfer by +0.145 AUROC on average, often recovering near single-task performance. Our results suggest that limited task diversity, not just limited data, constrains relational foundation models.
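
As a rough illustration of what "removing label-predictive directions from representation updates" could look like, the sketch below projects an update vector onto the orthogonal complement of a set of directions. This is not the paper's implementation: the function name `project_out`, the SVD-based basis, and the assumption that the label-predictive directions are already available as vectors are all hypothetical choices made for this example.

```python
import numpy as np


def project_out(grad, directions, rtol=1e-10):
    """Remove from `grad` its component in the span of `directions`.

    grad:       (d,) update vector for a representation.
    directions: (k, d) hypothetical label-predictive directions
                (how they are estimated is outside this sketch).
    """
    grad = np.asarray(grad, dtype=float)
    dirs = np.atleast_2d(np.asarray(directions, dtype=float))

    # Build an orthonormal basis for the span of the directions.
    # SVD handles non-orthogonal or linearly dependent inputs.
    _, s, vt = np.linalg.svd(dirs, full_matrices=False)
    tol = rtol * s.max() if s.size else 0.0
    basis = vt[s > tol]  # (r, d), r = numerical rank

    # Subtract the projection onto that span.
    return grad - basis.T @ (basis @ grad)


# Usage: the third coordinate plays the role of a label-predictive direction.
g = np.array([1.0, 2.0, 3.0])
leak = np.array([[0.0, 0.0, 2.0]])
g_clean = project_out(g, leak)
```

After projection, `g_clean` has no component along `leak`, so a step along it leaves the representation unchanged in that direction to first order.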
