CV · Apr 15, 2022

Sensitivity of sparse codes to image distortions

arXiv:2204.07466v1 · 2 citations · h-index: 56
Originality: Synthesis-oriented
AI Analysis

The paper highlights a critical limitation of sparse coding for computer vision, particularly for invariant object recognition, though it is an incremental analysis of known methods.

The paper demonstrates on MNIST that sparse codes are highly sensitive to image distortions, which may hinder invariant object recognition. A nearest neighbor classifier performs worse on sparse codes than on the original images. A linear classifier trained on enough labeled examples is more accurate on sparse codes than on the original images, but no more accurate than on a representation computed by a random feedforward net.

Sparse coding has been proposed as a theory of visual cortex and as an unsupervised algorithm for learning representations. We show empirically with the MNIST dataset that sparse codes can be very sensitive to image distortions, a behavior that may hinder invariant object recognition. A locally linear analysis suggests that the sensitivity is due to the existence of linear combinations of active dictionary elements with high cancellation. A nearest neighbor classifier is shown to perform worse on sparse codes than original images. For a linear classifier with a sufficiently large number of labeled examples, sparse codes are shown to yield higher accuracy than original images, but no higher than a representation computed by a random feedforward net. Sensitivity to distortions seems to be a basic property of sparse codes, and one should be aware of this property when applying sparse codes to invariant object recognition.
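
The cancellation argument in the abstract can be illustrated with a small sketch. It is not the paper's code. It assumes the common L1-penalized least-squares form of sparse coding, which the abstract does not specify, and it assumes that a small distortion leaves the active set and the coefficient signs unchanged. Under those assumptions the change in the active coefficients solves (D_S^T D_S) da = D_S^T dx, where D_S holds the active dictionary elements. The two-atom dictionary and the helper names (code_change, gain_along_cancellation) are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def normalize(u):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(dot(u, u))
    return [ui / n for ui in u]


def code_change(d1, d2, dx):
    """Change in the two active coefficients for an image perturbation dx.

    Assumes the active set {d1, d2} and the coefficient signs stay fixed,
    so the change solves the 2x2 system (D_S^T D_S) da = D_S^T dx.
    """
    g11, g12, g22 = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    b1, b2 = dot(d1, dx), dot(d2, dx)
    det = g11 * g22 - g12 * g12
    # Explicit inverse of the symmetric 2x2 Gram matrix.
    return [(g22 * b1 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det]


def gain_along_cancellation(angle_deg):
    """Ratio |da| / |dx| for a unit perturbation along d1 - d2.

    d1 and d2 are unit atoms separated by angle_deg. The closer they are,
    the more d1 - d2 cancels and the larger the resulting code change.
    """
    t = math.radians(angle_deg)
    d1 = [1.0, 0.0, 0.0]
    d2 = [math.cos(t), math.sin(t), 0.0]
    dx = normalize([a - b for a, b in zip(d1, d2)])
    da = code_change(d1, d2, dx)
    return math.sqrt(dot(da, da))


if __name__ == "__main__":
    for angle in (90, 30, 5):
        print(f"angle={angle:>2} deg  gain={gain_along_cancellation(angle):.2f}")
```

For two unit atoms with inner product c, this gain works out to 1/sqrt(1 - c), so it is 1 for orthogonal atoms and grows without bound as the atoms become nearly parallel. That is one concrete way a small image distortion can produce a large change in the code.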

