MLLGApr 24, 2018

On Learning Sparsely Used Dictionaries from Incomplete Samples

arXiv:1804.09217v11 citations
Originality Highly original
AI Analysis

This addresses a practical issue in applications like hyper-spectral imaging or blood glucose monitoring where data is often incomplete, offering a novel solution for dictionary learning in such settings.

The paper tackles the problem of learning sparsely used dictionaries from incomplete data samples, where only a fraction of entries are observed, and provides provably correct polynomial-time algorithms that converge to the true dictionary with theoretical guarantees.

Most existing algorithms for dictionary learning assume that all entries of the (high-dimensional) input data are fully observed. However, in several practical applications (such as hyper-spectral imaging or blood glucose monitoring), only an incomplete fraction of the data entries may be available. For incomplete settings, no provably correct and polynomial-time algorithm has been reported in the dictionary learning literature. In this paper, we provide provable approaches for learning - from incomplete samples - a family of dictionaries whose atoms have sufficiently "spread-out" mass. First, we propose a descent-style iterative algorithm that linearly converges to the true dictionary when provided a sufficiently coarse initial estimate. Second, we propose an initialization algorithm that utilizes a small number of extra fully observed samples to produce such a coarse initial estimate. Finally, we theoretically analyze their performance and provide asymptotic statistical and computational guarantees.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes