Towards Deep Compositional Networks
This addresses the problem of improving interpretability and efficiency in computer vision models for researchers and practitioners, though it is incremental as it builds on existing hierarchical models.
The paper tackled the lack of explicit structure in CNNs, which causes overfitting and limited generative abilities, by proposing a compositional model that performs comparably to CNNs on discriminative tasks while enabling visualization and faster inference.
Hierarchical feature learning based on convolutional neural networks (CNN) has recently shown significant potential in various computer vision tasks. While allowing high-quality discriminative feature learning, the downside of CNNs is the lack of explicit structure in features, which often leads to overfitting, absence of reconstruction from partial observations and limited generative abilities. Explicit structure is inherent in hierarchical compositional models, however, these lack the ability to optimize a well-defined cost function. We propose a novel analytic model of a basic unit in a layered hierarchical model with both explicit compositional structure and a well-defined discriminative cost function. Our experiments on two datasets show that the proposed compositional model performs on a par with standard CNNs on discriminative tasks, while, due to explicit modeling of the structure in the feature units, affording a straight-forward visualization of parts and faster inference due to separability of the units. Actions