ME, ST, ML; Oct 1, 2021

Dimension Reduction for Fréchet Regression

arXiv:2110.00467v2, 25 citations
Originality: Incremental advance
AI Analysis

This work addresses the curse of dimensionality in regression with metric space-valued responses, offering a practical tool for statisticians and data scientists. It is incremental in that it extends existing sufficient dimension reduction (SDR) methods to this setting.

The paper tackles high-dimensional predictors in Fréchet regression for non-Euclidean responses by introducing a flexible sufficient dimension reduction method that both reduces dimensionality and provides a visualization tool. The authors establish consistency and asymptotic convergence rates, and illustrate finite-sample performance through simulations on several metric spaces.

With the rapid development of data collection techniques, complex data objects that do not lie in Euclidean space are frequently encountered in new statistical applications. The Fréchet regression model (Petersen & Müller 2019) provides a promising framework for regression analysis with metric space-valued responses. In this paper, we introduce a flexible sufficient dimension reduction (SDR) method for Fréchet regression that serves two purposes: to mitigate the curse of dimensionality caused by high-dimensional predictors, and to provide a visual inspection tool for Fréchet regression. Our approach is flexible enough to turn any existing SDR method for Euclidean $(X, Y)$ into one for Euclidean $X$ and metric space-valued $Y$. The basic idea is to first map the metric space-valued random object $Y$ to a real-valued random variable $f(Y)$ using a class of functions, and then apply classical SDR to the transformed data. If the class of functions is sufficiently rich, we are guaranteed to recover the Fréchet SDR space. We show that such a class, which we call an ensemble, can be generated by a universal kernel. We establish the consistency and asymptotic convergence rate of the proposed methods. Their finite-sample performance is illustrated through simulation studies on several commonly encountered metric spaces, including the Wasserstein space, the space of symmetric positive definite matrices, and the sphere. We illustrate the data visualization aspect of our method by exploring human mortality distribution data across countries and by studying the distribution of hematoma density.
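The two-step idea in the abstract (map $Y$ to real-valued $f(Y)$ through a kernel-generated ensemble, then apply a classical SDR method) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes sliced inverse regression (SIR) as the classical SDR method, a Gaussian-type kernel exp(-gamma * d^2) computed from pairwise response distances, a median-distance bandwidth heuristic, and simple averaging of the per-function SIR candidate matrices. The names `sir_candidate` and `frechet_sir` are hypothetical. Whether this kernel is universal depends on the metric space.

```python
import numpy as np

def sir_candidate(Z, y, n_slices=10):
    """SIR candidate matrix: weighted outer products of slice means of Z,
    where slices are formed by sorting the real-valued response y."""
    n, p = Z.shape
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    return M

def frechet_sir(X, D, d, n_slices=10, gamma=None):
    """Ensemble SIR sketch (illustrative, hypothetical helper).
    X: (n, p) Euclidean predictors, assumed n > p with nonsingular covariance.
    D: (n, n) pairwise distances between the metric space-valued responses.
    d: number of directions to return."""
    n, p = X.shape
    # Standardize predictors: Z = (X - mean) Sigma^{-1/2}.
    Sigma = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - X.mean(axis=0)) @ Sigma_inv_sqrt
    # Assumed bandwidth heuristic: inverse squared median pairwise distance.
    if gamma is None:
        gamma = 1.0 / np.median(D[np.triu_indices(n, k=1)]) ** 2
    # Ensemble members f_i(Y) = exp(-gamma * d(Y, Y_i)^2), one per sample point.
    K = np.exp(-gamma * D ** 2)
    # Average the SIR candidate matrices over the ensemble.
    M = np.zeros((p, p))
    for i in range(n):
        M += sir_candidate(Z, K[:, i], n_slices)
    M /= n
    # Leading eigenvectors, mapped back to the original predictor scale.
    _, V = np.linalg.eigh(M)
    B = Sigma_inv_sqrt @ V[:, ::-1][:, :d]
    return B / np.linalg.norm(B, axis=0)

# Toy usage: responses are the distributions N(m_i, 1), for which the
# 2-Wasserstein distance between two responses is |m_i - m_j|.
rng = np.random.default_rng(0)
n, p = 400, 5
X = rng.standard_normal((n, p))
m = X[:, 0] + 0.1 * rng.standard_normal(n)  # response depends on X only via X[:, 0]
D = np.abs(m[:, None] - m[None, :])
B = frechet_sir(X, D, d=1)
```

In this toy setting the estimated direction `B[:, 0]` should load mainly on the first predictor. Only the distance matrix `D` touches the response space, so the same sketch applies to other metric spaces once their pairwise distances are available.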

