MLDec 29, 2013

Probabilistic Archetypal Analysis

arXiv:1312.7604v272 citations
AI Analysis

This work addresses a practical problem for researchers and practitioners in fields like social science and text analysis by enabling archetypal analysis on diverse data types, though it is incremental as it extends an existing method.

The authors tackled the limitation of traditional archetypal analysis to real-valued data by proposing a probabilistic framework that accommodates integer, binary, and probability vectors, and demonstrated its effectiveness through real-world applications like analyzing winter tourists, disaster-affected countries, and document archetypes.

Archetypal analysis represents a set of observations as convex combinations of pure patterns, or archetypes. The original geometric formulation of finding archetypes by approximating the convex hull of the observations assumes them to be real valued. This, unfortunately, is not compatible with many practical situations. In this paper we revisit archetypal analysis from the basic principles, and propose a probabilistic framework that accommodates other observation types such as integers, binary, and probability vectors. We corroborate the proposed methodology with convincing real-world applications on finding archetypal winter tourists based on binary survey data, archetypal disaster-affected countries based on disaster count data, and document archetypes based on term-frequency data. We also present an appropriate visualization tool to summarize archetypal analysis solution better.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes