Vector Quantile Regression on Manifolds
This work addresses a gap in statistical methods for analyzing data on non-Euclidean domains such as spheres and tori. The contribution is incremental in that it builds on existing quantile regression and optimal transport theory.
The paper extends quantile regression to multivariate distributions on manifolds, a problem that remains underexplored despite applications in fields such as climate science and protein analysis. It proposes a method that defines conditional vector quantile functions on manifolds and demonstrates its efficacy through synthetic and real data experiments.
Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multivariate distributions, QR for multivariate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate and geological phenomena) and tori (dihedral angles in proteins). By leveraging optimal transport theory and c-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets and likelihoods. We demonstrate the approach's efficacy and provide insights regarding the meaning of non-Euclidean quantiles through synthetic and real data experiments.
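For orientation, the classical univariate, Euclidean setting that the abstract takes as its starting point can be sketched in a few lines. This is not the manifold method proposed in the paper. It only illustrates the standard pinball (check) loss, whose minimizer over a constant is a sample quantile. The function names `pinball_loss` and `fit_constant_quantile` are hypothetical and chosen for illustration, and the brute-force search over sample values is an assumption made for simplicity.

```python
import numpy as np


def pinball_loss(residual, tau):
    """Pinball (check) loss for a quantile level tau in (0, 1).

    residual = y - prediction; positive residuals are weighted by tau,
    negative residuals by (1 - tau).
    """
    return np.maximum(tau * residual, (tau - 1.0) * residual)


def fit_constant_quantile(y, tau):
    """Return the sample value that minimizes the mean pinball loss.

    Brute-force illustration: a minimizer of the empirical pinball loss
    can always be found among the sample points, so each one is tried.
    """
    candidates = np.sort(np.asarray(y, dtype=float))
    losses = [pinball_loss(y - c, tau).mean() for c in candidates]
    return candidates[int(np.argmin(losses))]


# Illustrative data (hypothetical): for tau = 0.5 the minimizer is the median.
y = np.arange(1.0, 12.0)  # 1, 2, ..., 11
median_estimate = fit_constant_quantile(y, 0.5)
upper_estimate = fit_constant_quantile(y, 0.9)
```

In conditional QR the constant is replaced by a function of the explanatory features and the same loss is minimized over its parameters. The vector and manifold extensions described in the abstract replace this scalar ordering with an optimal transport construction, which this sketch does not attempt to reproduce.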