LG · Jan 31, 2023

Archetypal Analysis++: Rethinking the Initialization Strategy

arXiv:2301.13748v4 · 2 citations · h-index: 16 · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses a specific bottleneck in matrix factorization that matters to both researchers and practitioners, but the advance is incremental because it adapts existing methods such as k-means++.

The paper tackles the problem of poor initialization in archetypal analysis, which leads to sub-optimal solutions due to local minima, by proposing AA++, a probabilistic initialization strategy that outperforms baselines on 15 real-world datasets.

Archetypal analysis is a matrix factorization method with convexity constraints. Due to local minima, a good initialization is essential, but frequently used initialization methods either yield sub-optimal starting points or are prone to getting stuck in poor local minima. In this paper, we propose archetypal analysis++ (AA++), a probabilistic initialization strategy for archetypal analysis that sequentially samples points based on their influence on the objective function, similar to $k$-means++. In fact, we argue that $k$-means++ already approximates the proposed initialization method. Furthermore, we suggest adapting an efficient Monte Carlo approximation of $k$-means++ to AA++. In an extensive empirical evaluation on 15 real-world data sets of varying sizes and dimensionalities, considering two pre-processing strategies, we show that AA++ almost always outperforms all baselines, including the most frequently used ones.
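
The sequential sampling idea can be illustrated with a minimal sketch of $k$-means++-style seeding, which the abstract says already approximates AA++. The abstract does not spell out the exact AA++ sampling weights, so this sketch assumes the standard $k$-means++ rule instead: each new point is drawn with probability proportional to its squared distance to the nearest point chosen so far. The function name `d2_seeding` and its parameters are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def d2_seeding(X, k, rng=None):
    """Pick k row indices of X by k-means++-style sequential sampling.

    Assumption: this is the k-means++ rule, used here as a stand-in
    for AA++, whose exact weights are not given in the abstract.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]

    # First point: chosen uniformly at random.
    chosen = [int(rng.integers(n))]

    # Squared distance from every point to its nearest chosen point.
    d2 = np.sum((X - X[chosen[0]]) ** 2, axis=1)

    for _ in range(1, k):
        total = d2.sum()
        if total == 0:
            # Every point coincides with a chosen one: fall back to uniform.
            idx = int(rng.integers(n))
        else:
            # Points far from the current selection are more likely to be drawn.
            idx = int(rng.choice(n, p=d2 / total))
        chosen.append(idx)
        # Update the nearest-chosen-point distances with the new point.
        d2 = np.minimum(d2, np.sum((X - X[idx]) ** 2, axis=1))

    return np.array(chosen)
```

For example, `d2_seeding(X, 5, rng=1)` returns five row indices of a data matrix `X`, and the corresponding rows could serve as starting archetypes. A point that is already selected has distance zero and is therefore not drawn again unless all remaining distances are zero.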

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
