CV · LG · Apr 19, 2021

Image Modeling with Deep Convolutional Gaussian Mixture Models

arXiv:2104.12686v1 · 9 citations
Originality: Incremental advance
AI Analysis

This work addresses the inefficiency of vanilla GMMs for image modeling and offers a more scalable method for researchers in computer vision and generative modeling. The advance is incremental, since it builds on existing GMM and deep learning concepts.

The authors tackle image modeling with Gaussian Mixture Models (GMMs) by introducing Deep Convolutional Gaussian Mixture Models (DCGMMs), a stacked architecture whose GMM layers are linked by convolution and pooling, which reduces the number of components needed. On the MNIST and FashionMNIST datasets, DCGMMs outperform flat GMMs on clustering, sampling, and outlier detection.

In this conceptual work, we present Deep Convolutional Gaussian Mixture Models (DCGMMs): a new formulation of deep hierarchical Gaussian Mixture Models (GMMs) that is particularly suitable for describing and generating images. Vanilla (i.e., flat) GMMs require a very large number of components to describe images well, leading to long training times and memory issues. DCGMMs avoid this with a stacked architecture of multiple GMM layers, linked by convolution and pooling operations. This allows DCGMMs to exploit the compositionality of images much as deep CNNs do. DCGMMs can be trained end-to-end by Stochastic Gradient Descent. This sets them apart from vanilla GMMs, which are trained by Expectation-Maximization and require a prior k-means initialization, which is infeasible in a layered structure. For generating sharp images with DCGMMs, we introduce a new gradient-based technique for sampling through non-invertible operations like convolution and pooling. Based on the MNIST and FashionMNIST datasets, we validate the DCGMM model by demonstrating its superiority over flat GMMs for clustering, sampling, and outlier detection.
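To make the layered idea concrete, here is a minimal NumPy sketch of one convolutional GMM layer followed by pooling. It is an illustration under stated assumptions, not the authors' implementation: the function names (`extract_patches`, `gmm_layer`, `max_pool`), the diagonal covariances, the 8x8 patch size, the stride of 2, and the use of max pooling on the responsibilities are all choices made here for illustration. Parameters are random; no training step (SGD or otherwise) is shown.

```python
import numpy as np

def extract_patches(img, size, stride):
    """Slide a size x size window over a 2-D image; return (H', W', size*size)."""
    h, w = img.shape
    rows = range(0, h - size + 1, stride)
    cols = range(0, w - size + 1, stride)
    return np.array([[img[r:r + size, c:c + size].ravel() for c in cols]
                     for r in rows])

def gmm_layer(patches, weights, means, variances):
    """Evaluate one shared diagonal-covariance GMM on every patch.

    Returns per-patch log-likelihoods (H', W') and
    per-patch component responsibilities (H', W', K).
    """
    diff = patches[..., None, :] - means                                 # (H', W', K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=-1)   # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=-1)   # (H', W', K)
    log_joint = np.log(weights) + log_comp
    # Log-sum-exp over components, shifted by the maximum for numerical stability.
    top = log_joint.max(axis=-1, keepdims=True)
    log_lik = top[..., 0] + np.log(np.exp(log_joint - top).sum(axis=-1))
    resp = np.exp(log_joint - log_lik[..., None])
    return log_lik, resp

def max_pool(x, size):
    """Non-overlapping max pooling over the two leading (spatial) axes."""
    h, w, k = x.shape
    x = x[:h - h % size, :w - w % size]          # drop rows/cols that do not fit
    return x.reshape(h // size, size, w // size, size, k).max(axis=(1, 3))

# Hypothetical setup: one MNIST-sized image, K = 4 components on 8x8 patches.
rng = np.random.default_rng(0)
img = rng.random((28, 28))
K, size = 4, 8
weights = np.full(K, 1.0 / K)
means = rng.random((K, size * size))
variances = np.ones((K, size * size))

patches = extract_patches(img, size, stride=2)                 # (11, 11, 64)
log_lik, resp = gmm_layer(patches, weights, means, variances)  # (11, 11), (11, 11, 4)
pooled = max_pool(resp, 2)                                     # (5, 5, 4)
```

In this sketch, `pooled` would be the input to the next GMM layer. The parameter arithmetic hints at why stacking helps: with diagonal covariances, a flat GMM component on a 28x28 image carries 784 means and 784 variances, whereas a component shared across 8x8 patches carries only 64 of each.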

