EnCoBo: Energy-Guided Concept Bottlenecks for Interpretable Generation
This work targets interpretability and concept-level intervention in generative models, which is relevant to both researchers and practitioners, though it appears incremental because it builds on existing CBM frameworks.
The paper addresses the reliance of generative Concept Bottleneck Models (CBMs) on auxiliary visual cues, which undermines interpretability. It proposes EnCoBo, a post-hoc concept bottleneck that eliminates these cues and improves concept-level human intervention and interpretability while maintaining competitive visual quality on the CelebA-HQ and CUB datasets.
Concept Bottleneck Models (CBMs) provide interpretable decision-making through explicit, human-understandable concepts. However, existing generative CBMs often rely on auxiliary visual cues at the bottleneck, which undermines interpretability and intervention capabilities. We propose EnCoBo, a post-hoc concept bottleneck for generative models that eliminates auxiliary cues by constraining all representations to flow solely through explicit concepts. Unlike autoencoder-based approaches that inherently rely on black-box decoders, EnCoBo leverages a decoder-free, energy-based framework that directly guides generation in the latent space. Guided by diffusion-scheduled energy functions, EnCoBo supports robust post-hoc interventions, such as concept composition and negation, across arbitrary concepts. Experiments on the CelebA-HQ and CUB datasets show that EnCoBo improves concept-level human intervention and interpretability while maintaining competitive visual quality.
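To make the idea of composing and negating concepts through energy functions concrete, the following is a minimal toy sketch and not EnCoBo's actual implementation. It assumes each concept has a quadratic energy around a hypothetical latent "center", that composition adds energies, that negation subtracts a down-weighted energy, and that a linearly decaying noise scale can stand in for a diffusion schedule. The names `grad_energy`, `guided_sample`, and `neg_weight` are illustrative and do not come from the paper.

```python
import numpy as np


def grad_energy(z, center):
    """Gradient of the toy energy E(z) = 0.5 * ||z - center||^2 (an assumption)."""
    return z - center


def guided_sample(z0, present=(), negated=(), neg_weight=0.5,
                  steps=300, step_size=0.05, noise=0.1, seed=0):
    """Langevin-style latent update guided by summed concept energies.

    present: concept centers whose energies are added (composition).
    negated: concept centers whose energies are subtracted (negation),
             scaled by neg_weight < len(present) so the total stays bounded below.
    """
    rng = np.random.default_rng(seed)
    z = np.asarray(z0, dtype=float).copy()
    for t in range(steps):
        g = np.zeros_like(z)
        for c in present:
            g += grad_energy(z, np.asarray(c, dtype=float))
        for c in negated:
            g -= neg_weight * grad_energy(z, np.asarray(c, dtype=float))
        # Hypothetical schedule: noise decays linearly to zero over the run.
        sigma = noise * (1.0 - t / steps)
        z = z - step_size * g + np.sqrt(2.0 * step_size) * sigma * rng.standard_normal(z.shape)
    return z


if __name__ == "__main__":
    concept_a = np.array([2.0, 0.0])
    concept_b = np.array([0.0, 2.0])
    # Composition: both concepts active at once.
    print(guided_sample(np.zeros(2), present=[concept_a, concept_b]))
    # Negation: concept A active, concept B suppressed.
    print(guided_sample(np.zeros(2), present=[concept_a], negated=[concept_b]))
```

In this toy setting, composing the two concepts drives the latent toward the midpoint of their centers, while negating the second concept pushes the latent past the first center and away from the second. A real system would replace the quadratic energies with learned concept energies over a generative model's latent space.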