Off-the-grid learning of mixtures from a continuous dictionary
The paper addresses a fundamental challenge in signal processing and statistics, with applications such as spike deconvolution, though the contribution appears incremental because it builds on existing off-the-grid methods.
The paper tackles the problem of estimating both linear and non-linear parameters in a mixture model from a continuous dictionary, proposing an off-the-grid optimization method without discretization. It achieves prediction error bounds similar to Lasso rates up to log factors and provides convergence rates for parameter estimation with high probability.
We consider a general non-linear model where the signal is a finite mixture of an unknown, possibly increasing, number of features issued from a continuous dictionary parameterized by a real non-linear parameter. The signal is observed with Gaussian (possibly correlated) noise in either a continuous or a discrete setup. We propose an off-the-grid optimization method, that is, a method which does not use any discretization scheme on the parameter space, to estimate both the non-linear parameters of the features and the linear parameters of the mixture. We use recent results on the geometry of off-the-grid methods to give a minimal separation condition on the true underlying non-linear parameters such that interpolating certificate functions can be constructed. Using tail bounds for suprema of Gaussian processes, we also bound the prediction error with high probability. Assuming that the certificate functions can be constructed, our prediction error bound is similar, up to $\log$ factors, to the rates attained by the Lasso predictor in the linear regression model. We also establish convergence rates that quantify with high probability the quality of estimation for both the linear and the non-linear parameters. We develop our main results in full detail for two applications: Gaussian spike deconvolution and the scaled exponential model.
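To make the observation model concrete, the following is a minimal simulation sketch of the discrete Gaussian spike deconvolution setup, not the paper's estimator. Several details are assumptions made for illustration only: the function names (`gaussian_feature`, `mixture`, `simulate`), the kernel width `sigma`, the unit-norm scaling of each feature, the sampling grid, and the use of independent noise (the abstract also allows correlated noise). The location parameters `thetas` are real-valued and are never restricted to a grid, which is the sense in which the dictionary is continuous.

```python
import numpy as np

def gaussian_feature(t, theta, sigma=0.05):
    """Gaussian bump centred at theta, sampled at the points t.

    Assumption: the feature is rescaled to unit Euclidean norm.
    """
    phi = np.exp(-(t - theta) ** 2 / (2.0 * sigma ** 2))
    return phi / np.linalg.norm(phi)

def mixture(t, betas, thetas, sigma=0.05):
    """Noise-free signal: sum_k beta_k * phi(theta_k)."""
    return sum(b * gaussian_feature(t, th, sigma) for b, th in zip(betas, thetas))

def simulate(t, betas, thetas, noise_std, rng, sigma=0.05):
    """Observed signal: mixture plus i.i.d. Gaussian noise (an assumption)."""
    signal = mixture(t, betas, thetas, sigma)
    return signal + noise_std * rng.standard_normal(t.shape)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)   # observation points (discrete setup)
true_betas = [1.0, -0.5, 2.0]    # linear parameters (hypothetical values)
true_thetas = [0.2, 0.5, 0.8]    # non-linear parameters, well separated
y = simulate(t, true_betas, true_thetas, noise_std=0.1, rng=rng)
```

The prediction error discussed in the abstract would compare `mixture(t, est_betas, est_thetas)` for some estimate against the noise-free `mixture(t, true_betas, true_thetas)`; the estimation step itself is deliberately left out of this sketch.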