LG, AI | Jul 30, 2025

Teaching the Teacher: Improving Neural Network Distillability for Symbolic Regression via Jacobian Regularization

arXiv:2507.22767v2 | h-index: 38
Originality: Incremental advance
AI Analysis

This work addresses the problem of low-fidelity interpretable models, offering researchers and practitioners in AI a practical method to improve symbolic regression from neural networks. It is incremental in that it builds on existing distillation pipelines.

The paper tackles the brittleness of distilling neural networks into symbolic formulas by introducing a Jacobian-based regularizer that encourages smoother teacher functions, resulting in an average 120% relative improvement in the R² score of distilled symbolic models on real-world regression benchmarks.
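As a reading aid for the headline figure: assuming "relative improvement" carries its usual meaning (the gain in R² divided by the baseline R²), the quantity would be computed as below. The symbols and the example numbers are illustrative assumptions, not values from the paper, and the summary does not state whether the average is taken over per-benchmark ratios or over pooled scores.

```latex
\[
  \Delta_{\mathrm{rel}}
  = \frac{R^2_{\mathrm{reg}} - R^2_{\mathrm{base}}}{R^2_{\mathrm{base}}} \times 100\%
\]
% Hypothetical example: a baseline student with R^2 = 0.25 and a
% regularized-teacher student with R^2 = 0.55 give
% (0.55 - 0.25) / 0.25 = 1.2, i.e. a 120% relative improvement.
```

Under this reading, a 120% relative gain means the distilled model's R² slightly more than doubles. It does not mean an absolute increase of 1.2 in R².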

Distilling large neural networks into simple, human-readable symbolic formulas is a promising path toward trustworthy and interpretable AI. However, this process is often brittle, as the complex functions learned by standard networks are poor targets for symbolic discovery, resulting in low-fidelity student models. In this work, we propose a novel training paradigm to address this challenge. Instead of passively distilling a pre-trained network, we introduce a Jacobian-based regularizer that actively encourages the "teacher" network to learn functions that are not only accurate but also inherently smoother and more amenable to distillation. We demonstrate through extensive experiments on a suite of real-world regression benchmarks that our method is highly effective. By optimizing the regularization strength for each problem, we improve the R² score of the final distilled symbolic model by an average of 120% (relative) compared to the standard distillation pipeline, all while maintaining the teacher's predictive accuracy. Our work presents a practical and principled method for significantly improving the fidelity of interpretable models extracted from complex neural networks.
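The abstract does not give the exact form of the regularizer, so the following is only a minimal sketch of one common choice: adding the mean squared Frobenius norm of the input-output Jacobian to the regression loss. The network (a one-hidden-layer tanh MLP), the function names, and the strength parameter `lam` are all assumptions made for illustration. The Jacobian is written analytically in NumPy to keep the example self-contained; a real training setup would more likely use an autodiff framework.

```python
import numpy as np

def mlp_forward(params, x):
    """One-hidden-layer tanh MLP (hypothetical teacher): x (n, d) -> (n,)."""
    W1, b1, w2, b2 = params
    h = np.tanh(x @ W1 + b1)            # hidden activations, shape (n, hidden)
    return h @ w2 + b2                  # scalar prediction per sample

def input_jacobian(params, x):
    """Analytic df/dx for every sample, shape (n, d)."""
    W1, b1, w2, _ = params
    h = np.tanh(x @ W1 + b1)
    # Chain rule: df/dx_j = sum_k w2_k * (1 - h_k^2) * W1[j, k]
    return ((1.0 - h ** 2) * w2) @ W1.T

def jacobian_penalty(params, x):
    """Mean squared Frobenius norm of the input-output Jacobian."""
    J = input_jacobian(params, x)
    return np.mean(np.sum(J ** 2, axis=1))

def regularized_loss(params, x, y, lam):
    """MSE plus lam times the Jacobian penalty (assumed form of the objective)."""
    mse = np.mean((mlp_forward(params, x) - y) ** 2)
    return mse + lam * jacobian_penalty(params, x)

# Small usage example on synthetic data (not from the paper).
rng = np.random.default_rng(0)
d, hidden, n = 3, 8, 16
params = (rng.normal(size=(d, hidden)), np.zeros(hidden),
          rng.normal(size=hidden), 0.0)
x = rng.normal(size=(n, d))
y = np.sin(x[:, 0]) + 0.5 * x[:, 1]

for lam in (0.0, 0.01, 0.1):
    print(f"lam={lam}: loss={regularized_loss(params, x, y, lam):.4f}")
```

Larger values of `lam` penalize steep input sensitivities more heavily, which is one way to push a teacher toward smoother functions. The abstract reports tuning this strength per problem, so in a sketch like this `lam` would be treated as a hyperparameter selected on validation data.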

