Cluster-Based Generalized Additive Models Informed by Random Fourier Features
This work addresses the need for interpretable models in domains where black-box methods perform well but are hard to inspect. Its contribution is incremental, since it builds on existing GAM and RFF techniques.
The paper tackles the trade-off between prediction accuracy and model interpretability by introducing a mixture of generalized additive models (GAMs) that uses random Fourier features (RFF) to uncover local data structure. It reports improved predictive performance relative to classical interpretable models on real-world regression benchmarks, including the California Housing, NASA Airfoil Self-Noise, and Bike Sharing datasets.
Explainable machine learning aims to strike a balance between prediction accuracy and model transparency, particularly in settings where black-box predictive models, such as deep neural networks or kernel-based methods, achieve strong empirical performance but remain difficult to interpret. This work introduces a mixture of generalized additive models (GAMs) in which random Fourier feature (RFF) representations are leveraged to uncover locally adaptive structure in the data. In the proposed method, an RFF-based embedding is first learned and then compressed via principal component analysis. The resulting low-dimensional representations are used to perform soft clustering of the data through a Gaussian mixture model. These cluster assignments are then applied to construct a mixture-of-GAMs framework, where each local GAM captures nonlinear effects through interpretable univariate smooth functions. Numerical experiments on real-world regression benchmarks, including the California Housing, NASA Airfoil Self-Noise, and Bike Sharing datasets, demonstrate improved predictive performance relative to classical interpretable models. Overall, this construction provides a principled approach for integrating representation learning with transparent statistical modeling.