Learning Sparse Distributions using Iterative Hard Thresholding
This work addresses the challenge of learning sparse distributions, which is important for applications in statistics and machine learning, though it appears to be an incremental extension of existing IHT methods to a new domain.
The authors tackled the problem of learning sparse discrete distributions by applying iterative hard thresholding (IHT), proposing a greedy approximate projection to handle the simplex constraint, and demonstrated that this approach achieves state-of-the-art results in theory and practice.
Iterative hard thresholding (IHT) is a projected gradient descent algorithm, known to achieve state of the art performance for a wide range of structured estimation problems, such as sparse inference. In this work, we consider IHT as a solution to the problem of learning sparse discrete distributions. We study the hardness of using IHT on the space of measures. As a practical alternative, we propose a greedy approximate projection which simultaneously captures appropriate notions of sparsity in distributions, while satisfying the simplex constraint, and investigate the convergence behavior of the resulting procedure in various settings. Our results show, both in theory and practice, that IHT can achieve state of the art results for learning sparse distributions.