LGJun 16, 2015

Learning with Clustering Structure

arXiv:1506.04908v36 citations
Originality Incremental advance
AI Analysis

This work addresses the problem of enhancing model interpretability and performance in domains like text classification and multiclass prediction, though it appears incremental as it builds on existing clustering methods.

The paper tackles supervised learning by incorporating clustering constraints on features or samples to improve prediction and interpretation, developing a unified optimization framework and efficient algorithms with convergence guarantees.

We study supervised learning problems using clustering constraints to impose structure on either features or samples, seeking to help both prediction and interpretation. The problem of clustering features arises naturally in text classification for instance, to reduce dimensionality by grouping words together and identify synonyms. The sample clustering problem on the other hand, applies to multiclass problems where we are allowed to make multiple predictions and the performance of the best answer is recorded. We derive a unified optimization formulation highlighting the common structure of these problems and produce algorithms whose core iteration complexity amounts to a k-means clustering step, which can be approximated efficiently. We extend these results to combine sparsity and clustering constraints, and develop a new projection algorithm on the set of clustered sparse vectors. We prove convergence of our algorithms on random instances, based on a union of subspaces interpretation of the clustering structure. Finally, we test the robustness of our methods on artificial data sets as well as real data extracted from movie reviews.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes