Smooth InfoMax -- Towards Easier Post-Hoc Interpretability
This work aims to make deep learning models more interpretable for researchers and practitioners, though the contribution appears incremental because it builds on existing methods such as β-VAEs and Greedy InfoMax.
The paper tackles the problem of improving post-hoc interpretability in self-supervised representation learning. It introduces Smooth InfoMax (SIM), which incorporates interpretability constraints into the latent representations. The resulting latent spaces are smoother and better disentangled, which makes post-hoc interpretability methods more effective across layers on speech data.
We introduce Smooth InfoMax (SIM), a self-supervised representation learning method that incorporates interpretability constraints into the latent representations at different depths of the network. Based on $\beta$-VAEs, SIM's architecture consists of probabilistic modules optimized locally with the InfoNCE loss to produce Gaussian-distributed representations regularized toward the standard normal distribution. This creates smooth, well-defined, and better-disentangled latent spaces, enabling easier post-hoc analysis. Evaluated on speech data, SIM preserves the large-scale training benefits of Greedy InfoMax while improving the effectiveness of post-hoc interpretability methods across layers.
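To make the per-module objective described above concrete, the following is a minimal NumPy sketch of one plausible reading of it: an InfoNCE term plus a $\beta$-weighted KL term that pulls each Gaussian posterior toward the standard normal. It is an illustration under stated assumptions, not the authors' implementation. The function names, the bilinear scoring matrix `W`, the use of in-batch negatives, and the choice of the future encoding's mean as the positive are all assumptions made for this example.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), averaged over the batch."""
    kl_per_sample = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1)
    return float(np.mean(kl_per_sample))


def info_nce(queries, keys):
    """InfoNCE with in-batch negatives: keys[i] is the positive for queries[i]."""
    scores = queries @ keys.T                            # (batch, batch) similarities
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = scores - np.log(np.sum(np.exp(scores), axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_softmax)))


def sim_module_loss(mu, log_var, future, W, beta, rng):
    """Hypothetical local loss for one probabilistic module.

    mu, log_var: posterior parameters for the current step, shape (batch, dim)
    future:      encodings of a later step used as positives, shape (batch, dim)
    W:           assumed bilinear prediction matrix, shape (dim, dim)
    beta:        weight of the KL regularizer, as in a beta-VAE
    """
    # Reparameterization: sample z = mu + sigma * eps with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps
    predictions = z @ W
    return info_nce(predictions, future) + beta * kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch, dim = 8, 16
    mu = rng.standard_normal((batch, dim))
    log_var = 0.1 * rng.standard_normal((batch, dim))
    future = rng.standard_normal((batch, dim))
    W = 0.1 * rng.standard_normal((dim, dim))
    print(sim_module_loss(mu, log_var, future, W, beta=0.5, rng=rng))
```

In a greedily trained stack, each module would minimize such a loss on its own output, with gradients blocked between modules. With `beta` set to zero the KL term vanishes and the objective reduces to a plain InfoNCE loss on sampled latents, which corresponds roughly to the Greedy InfoMax setting the abstract compares against.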