AI · LG · Apr 8

Explaining Neural Networks in Preference Learning: a Post-hoc Inductive Logic Programming Approach

arXiv:2604.06838 · 6.1 · h-index: 19
Predicted impact top 98% in AI · last 90 days
Originality Synthesis-oriented
AI Analysis

This work addresses the challenge of making black-box neural networks interpretable for user preference learning. The contribution is incremental: it applies existing ILASP methods to a new domain and adds dimensionality reduction as a preprocessing step.

The paper explains neural networks in preference learning by using Inductive Learning of Answer Set Programs (ILASP) as a post-hoc approximator. It aims for appropriate fidelity to the target models, and a preprocessing step based on Principal Component Analysis limits the increase in computational time.
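To make the Principal Component Analysis preprocessing concrete, the following is a minimal sketch of extracting the first principal component from a small feature table. It is not the paper's pipeline: the toy data, the function names, and the use of power iteration are all assumptions chosen to keep the example dependency-free. The point it illustrates is that each component is a fixed weighted sum of the original features, so the weights can be inspected alongside any rule learned on the reduced data.

```python
import math


def center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_component(cov, iters=200):
    """First principal direction via power iteration.

    Assumes the covariance matrix is non-zero and that the starting
    vector is not orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, v):
    """Coordinate of every row along the direction v."""
    return [sum(a * b for a, b in zip(r, v)) for r in centered]


# Hypothetical toy data: two perfectly correlated recipe features.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(rows)
direction = top_component(covariance(centered))
reduced = project(centered, direction)
```

Here `direction` holds the loading of each original feature on the component, and `reduced` is the one-dimensional dataset a learner would see instead of the two original columns.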

In this paper, we propose using Learning from Answer Sets to approximate black-box models, such as Neural Networks (NN), in the specific case of learning user preferences. We specifically explore the use of ILASP (Inductive Learning of Answer Set Programs) to approximate preference learning systems through weak constraints. We have created a dataset on user preferences over a set of recipes, which is used to train the NNs that we aim to approximate with ILASP. Our experiments investigate ILASP both as a global and a local approximator of the NNs. These experiments address the challenge of approximating NNs working on increasingly high-dimensional feature spaces while achieving appropriate fidelity on the target model and limiting the increase in computational time. To handle this challenge, we propose a preprocessing step that exploits Principal Component Analysis to reduce the dataset's dimensionality while keeping our explanations transparent. Under consideration for publication in Theory and Practice of Logic Programming (TPLP).
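The abstract's notion of fidelity can be illustrated with a small sketch. The code below assumes one plausible definition, the fraction of item pairs on which a surrogate orders the items the same way as the black-box model; the paper may measure fidelity differently. The recipes, both scoring functions, and all names are hypothetical stand-ins: `black_box` plays the role of a trained NN, and `surrogate` mimics the penalty-counting style of weak constraints, where fewer violated rules means more preferred.

```python
from itertools import combinations


def preference(score_a, score_b):
    """Return 1 if a is preferred, -1 if b is preferred, 0 on a tie."""
    if score_a > score_b:
        return 1
    if score_a < score_b:
        return -1
    return 0


def fidelity(black_box, surrogate, items):
    """Share of item pairs on which both models give the same ordering."""
    pairs = list(combinations(items, 2))
    agree = sum(
        preference(black_box(a), black_box(b))
        == preference(surrogate(a), surrogate(b))
        for a, b in pairs
    )
    return agree / len(pairs)


# Hypothetical recipes; the feature names are illustrative only.
recipes = [
    {"name": "salad", "calories": 150, "prep_minutes": 10},
    {"name": "pasta", "calories": 600, "prep_minutes": 25},
    {"name": "stew", "calories": 450, "prep_minutes": 90},
    {"name": "cake", "calories": 800, "prep_minutes": 60},
]


def black_box(recipe):
    # Stand-in for a neural network's preference score (higher is better).
    return -(0.01 * recipe["calories"] + 0.02 * recipe["prep_minutes"])


def surrogate(recipe):
    # Stand-in for learned weak constraints: one penalty per violated rule.
    penalty = int(recipe["calories"] > 500) + int(recipe["prep_minutes"] > 45)
    return -penalty


score = fidelity(black_box, surrogate, recipes)
```

In this toy setting the two models disagree only on the pasta and stew pair, which the surrogate treats as a tie, so `score` is 5/6. Restricting `items` to the neighbourhood of a single recipe would give a local rather than a global estimate.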

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
