MEJan 8
Archetypal cases for questionnaires with nominal multiple choice questionsAleix Alcacer, Irene Epifanio
Archetypal analysis serves as an exploratory tool that interprets a collection of observations as convex combinations of pure (extreme) patterns. When these patterns correspond to actual observations within the sample, they are termed archetypoids. For the first time, we propose applying archetypoid analysis to nominal observations, specifically for identifying archetypal cases from questionnaires featuring nominal multiple-choice questions with a single possible answer. This approach can enhance our understanding of a nominal data set, similar to its application in multivariate contexts. We compare this methodology with the use of archetype analysis and probabilistic archetypal analysis and demonstrate the benefits of this methodology using a real-world example: the German credit dataset.
MEApr 16, 2025
A Survey on Archetypal AnalysisAleix Alcacer, Irene Epifanio, Sebastian Mair et al.
Archetypal analysis (AA) was originally proposed in 1994 by Adele Cutler and Leo Breiman as a computational procedure to extract the distinct aspects called archetypes in observations with each observational record approximated as a mixture (i.e., convex combination) of these archetypes. AA thereby provides straightforward, interpretable, and explainable representations for feature extraction and dimensionality reduction, facilitating the understanding of the structure of high-dimensional data with wide applications throughout the sciences. However, AA also faces challenges, particularly as the associated optimization problem is non-convex. This survey provides researchers and data mining practitioners an overview of methodologies and opportunities that AA has to offer surveying the many applications of AA across disparate fields of science, as well as best practices for modeling data using AA and limitations. The survey concludes by explaining important future research directions concerning AA.
MLJul 16, 2025
Incorporating Fairness Constraints into Archetypal AnalysisAleix Alcacer, Irene Epifanio
Archetypal Analysis (AA) is an unsupervised learning method that represents data as convex combinations of extreme patterns called archetypes. While AA provides interpretable and low-dimensional representations, it can inadvertently encode sensitive attributes, leading to fairness concerns. In this work, we propose Fair Archetypal Analysis (FairAA), a modified formulation that explicitly reduces the influence of sensitive group information in the learned projections. We also introduce FairKernelAA, a nonlinear extension that addresses fairness in more complex data distributions. Our approach incorporates a fairness regularization term while preserving the structure and interpretability of the archetypes. We evaluate FairAA and FairKernelAA on synthetic datasets, including linear, nonlinear, and multi-group scenarios, demonstrating their ability to reduce group separability -- as measured by mean maximum discrepancy and linear separability -- without substantially compromising explained variance. We further validate our methods on the real-world ANSUR I dataset, confirming their robustness and practical utility. The results show that FairAA achieves a favorable trade-off between utility and fairness, making it a promising tool for responsible representation learning in sensitive applications.
APFeb 28, 2020
Finding archetypal patterns for binary questionnairesIsmael Cabero, Irene Epifanio
Archetypal analysis is an exploratory tool that explains a set of observations as mixtures of pure (extreme) patterns. If the patterns are actual observations of the sample, we refer to them as archetypoids. For the first time, we propose to use archetypoid analysis for binary observations. This tool can contribute to the understanding of a binary data set, as in the multivariate case. We illustrate the advantages of the proposed methodology in a simulation study and two applications, one exploring objects (rows) and the other exploring items (columns). One is related to determining student skill set profiles and the other to describing item response functions.
MLOct 1, 2018
Robust multivariate and functional archetypal analysis with application to financial time series analysisJesús Moliner, Irene Epifanio
Archetypal analysis approximates data by means of mixtures of actual extreme cases (archetypoids) or archetypes, which are a convex combination of cases in the data set. Archetypes lie on the boundary of the convex hull. This makes the analysis very sensitive to outliers. A robust methodology by means of M-estimators for classical multivariate and functional data is proposed. This unsupervised methodology allows complex data to be understood even by non-experts. The performance of the new procedure is assessed in a simulation study, where a comparison with a previous methodology for the multivariate case is also carried out, and our proposal obtains favorable results. Finally, robust bivariate functional archetypoid analysis is applied to a set of companies in the S\&P 500 described by two time series of stock quotes. A new graphic representation is also proposed to visualize the results. The analysis shows how the information can be easily interpreted and how even non-experts can gain a qualitative understanding of the data.
MEJan 26, 2016
Functional archetype and archetypoid analysisIrene Epifanio
Archetype and archetypoid analysis can be extended to functional data. Each function is represented as a mixture of actual observations (functional archetypoids) or functional archetypes, which are a mixture of observations in the data set. Well-known Canadian temperature data are used to illustrate the analysis developed. Computational methods are proposed for performing these analyses, based on the coefficients of a basis. Unlike a previous attempt to compute functional archetypes, which was only valid for an orthogonal basis, the proposed methodology can be used for any basis. It is computationally less demanding than the simple approach of discretizing the functions. Multivariate functional archetype and archetypoid analysis are also introduced and applied in an interesting problem about the study of human development around the world over the last 50 years. These tools can contribute to the understanding of a functional data set, as in the multivariate case.