ML Sep 28, 2023
Selective Nonparametric Regression via Testing
Fedor Noskov, Alexander Fishkov, Maxim Panov
Prediction with the possibility of abstention (or selective prediction) is an important problem for error-critical machine learning applications. While well-studied in the classification setup, selective approaches to regression are much less developed. In this work, we consider the nonparametric heteroskedastic regression problem and develop an abstention procedure via testing a hypothesis on the value of the conditional variance at a given point. Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor. We prove non-asymptotic bounds on the risk of the resulting estimator and show the existence of several different convergence regimes. The theoretical analysis is illustrated with a series of experiments on simulated and real-world data.
ML Jul 26, 2023
Optimal Noise Reduction in Dense Mixed-Membership Stochastic Block Models under Diverging Spiked Eigenvalues Condition
Fedor Noskov, Maxim Panov
Community detection is one of the most critical problems in modern network science. Its applications can be found in various fields, from protein modeling to social network analysis. Recently, many papers have appeared studying the problem of overlapping community detection, where each node of a network may belong to several communities. In this work, we consider the Mixed-Membership Stochastic Block Model (MMSB), first proposed by Airoldi et al. MMSB provides quite a general setting for modeling overlapping community structure in graphs. The central question of this paper is how to reconstruct relations between communities given an observed network. We compare different approaches and establish the minimax lower bound on the estimation error. Then, we propose a new estimator that matches this lower bound. Theoretical results are proved under fairly general conditions on the considered model. Finally, we illustrate the theory in a series of experiments.
11.1 ST Apr 20
Low-Rank Graphon Estimation: Theory and Applications to Graphon Games
Olga Klopp, Fedor Noskov
We study low-rank estimation of an unknown sparse graphon from sampled network data under operator-norm loss, motivated by targeted interventions in graphon games. Starting from the observed adjacency matrix, we construct low-rank surrogates by singular value thresholding and, for smooth graphons, by block averaging followed by thresholding. We obtain non-asymptotic bounds on both the operator-norm error and the rank of the resulting estimator for stochastic block model, Hölder, and analytic graphons, and we complement these results with minimax lower bounds showing that the rates are essentially sharp for these classes. Our analysis highlights that low rank is valuable here primarily for computation: while it does not improve the minimax operator-norm rate, it yields operator-norm accurate surrogates with substantially smaller rank. We then apply these estimators to linear-quadratic graphon games and derive non-asymptotic stability bounds showing that the welfare loss incurred by using an estimated graphon is controlled by the operator-norm perturbation. This yields near-optimal guarantees for targeted interventions computed from the estimated graphon, together with substantial computational savings. For zero baseline heterogeneity and under a spectral-gap condition, we also establish matching lower bounds for intervention regret. Numerical experiments illustrate the trade-off between statistical accuracy, retained rank, and runtime.
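As a rough illustration of the singular value thresholding step mentioned in this abstract, the sketch below hard-thresholds the singular values of an adjacency matrix sampled from a two-block stochastic block model. The function name `singular_value_threshold`, the simulated model, and the threshold `2 * sqrt(n * p_max)` are assumptions made for this example only; the abstract does not state the paper's actual threshold or tuning.

```python
import numpy as np


def singular_value_threshold(adjacency, threshold):
    """Hard-threshold singular values: keep only those strictly above `threshold`.

    Returns the low-rank surrogate and the number of retained singular values.
    (Hypothetical helper for illustration, not the paper's estimator.)
    """
    u, s, vt = np.linalg.svd(adjacency, full_matrices=False)
    keep = s > threshold
    low_rank = (u[:, keep] * s[keep]) @ vt[keep, :]
    return low_rank, int(keep.sum())


# Simulate a symmetric adjacency matrix from a two-block stochastic block model.
rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 2, size=n)
block_probs = np.array([[0.6, 0.1], [0.1, 0.5]])
probs = block_probs[labels][:, labels]  # n x n matrix of edge probabilities

upper = np.triu(rng.random((n, n)) < probs, k=1)  # sample each edge once
adjacency = (upper | upper.T).astype(float)       # symmetrize, no self-loops

# Assumed threshold of order sqrt(n * p_max); an illustrative choice only.
threshold = 2.0 * np.sqrt(n * probs.max())
estimate, rank = singular_value_threshold(adjacency, threshold)

# Operator-norm (spectral-norm) error against the true probability matrix.
op_error = np.linalg.norm(estimate - probs, ord=2)
print(f"retained rank: {rank}, operator-norm error: {op_error:.2f}")
```

The retained rank is far smaller than `n`, which is the computational benefit the abstract points to: downstream computations can work with a few singular vectors instead of the full matrix.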
ML Dec 25, 2023
Efficient Conformal Prediction under Data Heterogeneity
Vincent Plassier, Nikita Kotelevskii, Aleksandr Rubashevskii et al.
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification, which is crucial for ensuring the reliability of predictions. However, common CP methods heavily rely on data exchangeability, a condition often violated in practice. Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples. This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions. We illustrate the general theory with applications to the challenging setting of federated learning under data heterogeneity between agents. Our method allows constructing provably valid personalized prediction sets for agents in a fully federated way. The effectiveness of the proposed method is demonstrated in a series of experiments on real-world datasets.
ST Feb 21, 2025
Dimension-free bounds in high-dimensional linear regression via error-in-operator approach
Fedor Noskov, Nikita Puchkin, Vladimir Spokoiny
We consider the problem of high-dimensional linear regression with random design. We suggest a novel approach, referred to as error-in-operator, which does not estimate the design covariance $\Sigma$ directly but incorporates it into empirical risk minimization. We provide an expansion of the excess prediction risk and derive non-asymptotic dimension-free bounds on the leading term and the remainder. This helps us to show that auxiliary variables do not increase the effective dimension of the problem, provided that the parameters of the procedure are tuned properly. We also discuss computational aspects of our method and illustrate its performance with numerical experiments.
ML Feb 7, 2022
Nonparametric Uncertainty Quantification for Single Deterministic Neural Network
Nikita Kotelevskii, Aleksandr Artemenkov, Kirill Fedyanin et al.
This paper proposes a fast and scalable method for uncertainty quantification of machine learning models' predictions. First, we show a principled way to measure the uncertainty of predictions for a classifier based on the Nadaraya-Watson nonparametric estimate of the conditional label distribution. Importantly, the proposed approach allows one to explicitly disentangle aleatoric and epistemic uncertainties. The resulting method works directly in the feature space. However, one can apply it to any neural network by considering an embedding of the data induced by the network. We demonstrate the strong performance of the method in uncertainty estimation tasks on text classification problems and a variety of real-world image datasets, such as MNIST, SVHN, CIFAR-100 and several versions of ImageNet.
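For readers unfamiliar with the Nadaraya-Watson estimate of the conditional label distribution that this abstract builds on, a minimal sketch follows. It covers only the kernel-weighted class probability estimate, not the paper's uncertainty measures; the Gaussian kernel, the `bandwidth` value, the function name, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def nadaraya_watson_probs(x_train, y_train, x_query, n_classes, bandwidth=1.0):
    """Kernel-weighted estimate of p(y = c | x) with a Gaussian kernel.

    x_train: (n, d) features (for a neural network, embeddings could be used),
    y_train: (n,) integer labels, x_query: (m, d) query points.
    (Hypothetical helper for illustration only.)
    """
    diffs = x_query[:, None, :] - x_train[None, :, :]
    sq_dists = np.sum(diffs**2, axis=-1)                # (m, n) squared distances
    weights = np.exp(-0.5 * sq_dists / bandwidth**2)    # Gaussian kernel weights
    one_hot = np.eye(n_classes)[y_train]                # (n, n_classes)
    numer = weights @ one_hot                           # weighted class counts
    denom = weights.sum(axis=1, keepdims=True)          # total kernel mass
    return numer / np.maximum(denom, 1e-12)             # guard against zero mass


# Toy data: two Gaussian clusters in 2D, one per class.
rng = np.random.default_rng(0)
x0 = rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(50, 2))
x1 = rng.normal(loc=[2.0, 0.0], scale=0.5, size=(50, 2))
x_train = np.vstack([x0, x1])
y_train = np.array([0] * 50 + [1] * 50)

x_query = np.array([[-2.0, 0.0], [2.0, 0.0], [0.0, 0.0]])
probs = nadaraya_watson_probs(x_train, y_train, x_query, n_classes=2)
print(probs)
```

Queries near a cluster center receive a confident class estimate, while the midpoint between clusters receives a more balanced one. The total kernel mass `denom` shrinks for queries far from all training points, which is the kind of quantity a nonparametric uncertainty measure can draw on.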