LG · AI · Jan 2

Interpretability-Guided Bi-objective Optimization: Aligning Accuracy and Explainability

arXiv:2601.00655v2
Originality: Incremental advance
AI Analysis

This work addresses the challenge of balancing model performance with interpretability for domain experts, though it is incremental in its approach.

The paper trains interpretable models with a bi-objective optimization framework that aligns accuracy and explainability, enforcing interpretability constraints on time-series data with minimal accuracy loss.

This paper introduces Interpretability-Guided Bi-objective Optimization (IGBO), a framework that trains interpretable models by incorporating structured domain knowledge via a bi-objective formulation. IGBO encodes feature importance hierarchies as a Directed Acyclic Graph (DAG) and uses Temporal Integrated Gradients (TIG) to measure feature importance. The DAG is built with a Central Limit Theorem-based construction, which ensures the statistical validity of edge orientation decisions. To address the Out-of-Distribution (OOD) problem in TIG computation, the paper proposes an Optimal Path Oracle that learns data-manifold-aware integration paths. Theoretical analysis establishes convergence properties via a geometric projection mapping $\mathcal{P}$ and proves robustness to mini-batch noise. Empirical results on time-series data demonstrate IGBO's effectiveness in enforcing DAG constraints with minimal accuracy loss, outperforming standard regularization baselines.
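To make two of the ingredients named in the abstract concrete, here is a minimal Python sketch. It is not the paper's method: it uses standard straight-line Integrated Gradients (not TIG, and not the learned Optimal Path Oracle), and a simple paired z-test as one plausible reading of a "Central Limit Theorem-based" edge orientation rule. The toy model, the function names (`model`, `integrated_gradients`, `orient_edge`), the threshold `z_crit`, and the convention that an edge `i->j` means "feature i is more important than feature j" are all assumptions made for illustration.

```python
import math
import numpy as np

def model(x, w):
    """Toy differentiable model f(x) = tanh(w . x); a stand-in for a trained network."""
    return np.tanh(x @ w)

def model_grad(x, w):
    """Analytic gradient of the toy model with respect to x (x may be batched)."""
    s = np.tanh(x @ w)
    return (1.0 - s ** 2)[..., None] * w

def integrated_gradients(x, baseline, w, steps=200):
    """Standard straight-line IG, approximated with a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps            # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)   # shape (steps, d)
    avg_grad = model_grad(path, w).mean(axis=0)          # shape (d,)
    return (x - baseline) * avg_grad

def orient_edge(imp_i, imp_j, z_crit=1.96):
    """Hypothetical CLT-style rule: z-test on paired differences of |importance|.

    Returns "i->j" if feature i is significantly more important than j,
    "j->i" if the reverse holds, and None if the test is inconclusive.
    """
    diff = np.abs(imp_i) - np.abs(imp_j)
    se = diff.std(ddof=1) / math.sqrt(len(diff))
    if se == 0.0:
        return None  # no variation in the differences: no basis for a decision
    z = diff.mean() / se
    if z > z_crit:
        return "i->j"
    if z < -z_crit:
        return "j->i"
    return None

if __name__ == "__main__":
    w = np.array([0.8, 0.1, -0.3])
    x = np.array([1.0, -0.5, 0.2])
    baseline = np.zeros(3)
    attr = integrated_gradients(x, baseline, w)
    # Completeness axiom of IG: attributions sum to f(x) - f(baseline).
    print(attr, attr.sum(), model(x, w) - model(baseline, w))
```

The completeness check at the end is a property of standard Integrated Gradients and is a useful sanity test for any path-based variant. In a real pipeline the per-sample importances passed to `orient_edge` would come from attributions over many inputs, and the z-test is only reasonable when the sample is large enough for the normal approximation to hold.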

