LG | AI | Mar 16

TabKD: Tabular Knowledge Distillation through Interaction Diversity of Learned Feature Bins

Shovon Niverd Pereira, Krishna Khadka, Yu Lei

arXiv:2603.15481 | 10.91 citations | h-index: 3

AI Analysis

This work provides a principled framework for model compression in privacy-sensitive tabular domains, addressing a specific bottleneck with incremental improvements over existing methods.

The paper tackles data-free knowledge distillation for tabular data, where existing methods lack explicit coverage of feature interactions. It proposes TabKD, which generates synthetic queries that maximize pairwise interaction diversity and achieves the highest student-teacher agreement in 14 out of 16 configurations across benchmarks.

Data-free knowledge distillation enables model compression without original training data, which is critical for privacy-sensitive tabular domains. However, existing methods do not perform well on tabular data because they do not explicitly address feature interactions, the fundamental way tabular models encode predictive knowledge. We identify interaction diversity, systematic coverage of feature combinations, as an essential requirement for effective tabular distillation. To operationalize this insight, we propose TabKD, which learns adaptive feature bins aligned with teacher decision boundaries, then generates synthetic queries that maximize pairwise interaction coverage. Across 4 benchmark datasets and 4 teacher architectures, TabKD achieves the highest student-teacher agreement in 14 out of 16 configurations, outperforming 5 state-of-the-art baselines. We further show that interaction coverage strongly correlates with distillation quality, validating our core hypothesis. Our work establishes interaction-focused exploration as a principled framework for tabular model extraction.
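To make "pairwise interaction coverage" concrete, the sketch below shows one plausible way to measure it: bin every feature, then count the fraction of bin-pair cells hit by at least one synthetic query, summed over all feature pairs. This is an illustration only. The function name `pairwise_coverage`, the fixed bin edges, and the exact definition of the metric are assumptions; the abstract does not specify TabKD's formula, and TabKD learns its bins from the teacher rather than taking them as given.

```python
from itertools import combinations

import numpy as np


def pairwise_coverage(queries, edges):
    """Fraction of (bin_i, bin_j) cells hit by at least one query.

    Hypothetical metric, not taken from the paper.
    queries: (n, d) array of synthetic queries.
    edges:   list of d sorted 1-D arrays of interior bin edges
             (fixed here; TabKD would learn them from the teacher).
    """
    queries = np.asarray(queries, dtype=float)
    d = queries.shape[1]
    # Map every feature value to the index of the bin it falls into.
    bins = np.stack(
        [np.digitize(queries[:, k], edges[k]) for k in range(d)], axis=1
    )
    n_bins = [len(e) + 1 for e in edges]

    covered = total = 0
    for i, j in combinations(range(d), 2):
        # Distinct bin pairs that the queries reach for features i and j.
        hit = {(int(a), int(b)) for a, b in zip(bins[:, i], bins[:, j])}
        covered += len(hit)
        total += n_bins[i] * n_bins[j]
    return covered / total


# Two features, two bins each, so four cells in total.
edges = [np.array([0.5]), np.array([0.5])]
clustered = [[0.1, 0.1], [0.9, 0.9]]
grid = [[0.1, 0.1], [0.1, 0.9], [0.9, 0.1], [0.9, 0.9]]
print(pairwise_coverage(clustered, edges))  # 0.5
print(pairwise_coverage(grid, edges))       # 1.0
```

Under this definition, queries clustered along the diagonal reach only half of the cells, while a query set spanning every bin combination reaches all of them. That gap is the kind of difference a coverage-maximizing generator would aim to close.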
