Graphon Estimation in bipartite graphs with observable edge labels and unobservable node labels
This work addresses a specific statistical estimation problem in network analysis, with incremental contributions to graphon estimation methods.
The paper tackles the problem of estimating a graphon (a bivariate function) in bipartite graphs with observable edge labels but unobservable node labels. It establishes finite-sample risk bounds for the least-squares estimator and the exponentially weighted aggregate, and proposes an adaptation of Lloyd's alternating minimization algorithm to approximate the intractable least-squares estimator. Numerical experiments on synthetic data illustrate the empirical performance.
Many real-world data sets can be represented as a matrix whose entries correspond to the interaction between two entities of different natures (the number of times a web user visits a web page, a student's grade in a subject, a patient's rating of a doctor, etc.). We assume in this paper that this interaction is determined by unobservable latent variables describing each entity. Our objective is to estimate the conditional expectation of the data matrix given the unobservable variables. This is presented as the problem of estimating a bivariate function referred to as a graphon. We study the cases of piecewise constant and Hölder-continuous graphons. We establish finite-sample risk bounds for the least-squares estimator and the exponentially weighted aggregate. These bounds highlight the dependence of the estimation error on the size of the data set, the maximum intensity of the interactions, and the level of noise. As the analyzed least-squares estimator is intractable, we propose an adaptation of Lloyd's alternating minimization algorithm to approximate it. Finally, we present numerical experiments to illustrate the empirical performance of the graphon estimator on synthetic data sets.
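To make the alternating minimization idea concrete, the following is a minimal sketch of a Lloyd-type scheme for the piecewise constant case. It is not the paper's algorithm: the function names `block_means` and `lloyd_biclustering`, the random initialization, the stopping rule, and the assumption that the numbers of row groups `K` and column groups `L` are known are all illustrative choices. The sketch alternates between averaging the data matrix `H` over the current blocks and reassigning each row and each column to the label that minimizes its squared error.

```python
import numpy as np


def block_means(H, z, w, K, L):
    """Average H over each (row-group, column-group) block.

    Empty blocks fall back to the global mean of H (an arbitrary choice).
    """
    Zr = np.eye(K)[z]                      # n x K one-hot row labels
    Zc = np.eye(L)[w]                      # m x L one-hot column labels
    sums = Zr.T @ H @ Zc                   # K x L block sums
    counts = np.outer(Zr.sum(axis=0), Zc.sum(axis=0))
    Q = np.full((K, L), float(H.mean()))
    np.divide(sums, counts, out=Q, where=counts > 0)
    return Q


def lloyd_biclustering(H, K, L, n_iter=50, seed=0):
    """Hypothetical Lloyd-type alternating minimization of
    sum_ij (H_ij - Q[z_i, w_j])^2 over labels z, w and block values Q."""
    rng = np.random.default_rng(seed)
    n, m = H.shape
    z = rng.integers(K, size=n)            # random initial row labels
    w = rng.integers(L, size=m)            # random initial column labels
    for _ in range(n_iter):
        # Row step: fix w and Q, pick the best label for every row.
        Q = block_means(H, z, w, K, L)
        Qc = Q[:, w]                       # K x m, candidate row profiles
        z_new = np.argmin(-2 * H @ Qc.T + (Qc ** 2).sum(axis=1), axis=1)
        # Column step: refresh Q, then pick the best label for every column.
        Q = block_means(H, z_new, w, K, L)
        Qr = Q[z_new, :]                   # n x L, candidate column profiles
        w_new = np.argmin(-2 * H.T @ Qr + (Qr ** 2).sum(axis=0), axis=1)
        if np.array_equal(z_new, z) and np.array_equal(w_new, w):
            break                          # labels stable: local minimum
        z, w = z_new, w_new
    Q = block_means(H, z, w, K, L)
    return Q[z][:, w], z, w, Q             # fitted matrix, labels, block values
```

Calling `lloyd_biclustering(H, K, L)` on an `n x m` array returns a piecewise constant fit of the same shape together with the estimated labels. Each step cannot increase the squared error, so the procedure converges to a local minimum that depends on the initialization; in practice one would likely run it from several seeds and keep the best fit.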