3.1STApr 20
Conformal Robust Set EstimationAlejandro Cholaquidis, Emilien Joly, Leonardo Moreno
Conformal prediction provides finite-sample, distribution-free coverage under exchangeability, but standard constructions may lack robustness in the presence of outliers or heavy tails. We propose a robust conformal method based on a non-conformity score defined as the half-mass radius around a point, equivalently the distance to its $(\lfloor n/2\rfloor+1)$-nearest neighbour. We show that the resulting conformal regions are marginally valid for any sample size and converge in probability to a robust population central set defined through a distance-to-a-measure functional. Under mild regularity conditions, we establish exponential concentration and tail bounds that quantify the deviation between the empirical conformal region and its population counterpart. These results provide a probabilistic justification for using robust geometric scores in conformal prediction, even for heavy-tailed or multi-modal distributions.
MLOct 12, 2023
Conformal inference for regression on Riemannian ManifoldsAlejandro Cholaquidis, Fabrice Gamboa, Leonardo Moreno
Regression on manifolds, and, more broadly, statistics on manifolds, has garnered significant importance in recent years due to the vast number of applications for non Euclidean data. Circular data is a classic example, but so is data in the space of covariance matrices, data on the Grassmannian manifold obtained as a result of principal component analysis, among many others. In this work we investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariable, denoted by $X$, lies in an Euclidean space. This extends the concepts delineated in \cite{waser14} to this novel context. Aligning with traditional principles in conformal inference, these prediction sets are distribution-free, indicating that no specific assumptions are imposed on the joint distribution of $(X,Y)$, and they maintain a non-parametric character. We prove the asymptotic almost sure convergence of the empirical version of these regions on the manifold to their population counterparts. The efficiency of this method is shown through a comprehensive simulation study and an analysis involving real-world data.