A statistical physics approach to learning curves for the Inverse Ising problem
This work provides theoretical insights into learning curves for statistical physics models, but it is incremental, as it builds on existing methods such as the pseudo-likelihood and replica-cavity approaches.
The authors tackle the inverse Ising problem by analyzing the reconstruction error of couplings learned from data. They show that an estimator based on a quadratic cost function achieves minimal reconstruction error but requires the length of the true coupling vector as prior knowledge, while a simple mean field estimator that needs no such knowledge is asymptotically optimal. The theory matches numerical simulations for random-coupling models in the high-temperature region.
Using methods of statistical physics, we analyse the error of learning couplings in large Ising models from independent data (the inverse Ising problem). We concentrate on learning based on local cost functions, such as the pseudo-likelihood method, for which the couplings are inferred independently for each spin. Assuming that the data are generated from a true Ising model, we compute the reconstruction error of the couplings using a combination of the replica method with the cavity approach for densely connected systems. We show that an explicit estimator based on a quadratic cost function achieves minimal reconstruction error, but requires the length of the true coupling vector as prior knowledge. A simple mean field estimator of the couplings, which does not need such knowledge, is asymptotically optimal, i.e. when the number of observations is much larger than the number of spins. Comparison of the theory with numerical simulations shows excellent agreement for data generated from two models with random couplings in the high-temperature region: a model with independent couplings (Sherrington-Kirkpatrick model), and a model where the matrix of couplings has a Wishart distribution.
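To make the setting concrete, the following is a minimal sketch of pseudo-likelihood learning on a toy problem. It is not the authors' implementation. It assumes a tiny system with no external fields, sampled by exact enumeration, and uses plain gradient ascent. The function names (`sample_ising`, `pseudo_log_likelihood`, `fit_spin`) and all parameter values are hypothetical choices made for illustration. The couplings are drawn as independent Gaussians in the spirit of the Sherrington-Kirkpatrick model, and each spin's incoming couplings are inferred independently from the conditional distribution of that spin given the others.

```python
import itertools
import math
import random


def sample_ising(J, n_samples, rng):
    """Draw independent samples from P(s) proportional to
    exp(sum_{i<j} J[i][j] * s_i * s_j) by exact enumeration (tiny n only)."""
    n = len(J)
    states = list(itertools.product((-1, 1), repeat=n))
    weights = []
    for s in states:
        energy = sum(J[i][j] * s[i] * s[j]
                     for i in range(n) for j in range(i + 1, n))
        weights.append(math.exp(energy))
    return rng.choices(states, weights=weights, k=n_samples)


def pseudo_log_likelihood(w, i, samples):
    """Mean log P(s_i | other spins) under coupling vector w (entry i unused)."""
    total = 0.0
    for s in samples:
        h = sum(w[j] * s[j] for j in range(len(s)) if j != i)
        total += s[i] * h - math.log(2.0 * math.cosh(h))
    return total / len(samples)


def fit_spin(i, samples, lr=0.1, steps=200):
    """Infer the couplings into spin i by gradient ascent on its
    pseudo-log-likelihood. Step size and step count are illustrative."""
    n = len(samples[0])
    w = [0.0] * n
    for _ in range(steps):
        grad = [0.0] * n
        for s in samples:
            h = sum(w[j] * s[j] for j in range(n) if j != i)
            residual = s[i] - math.tanh(h)  # derivative of the local cost w.r.t. h
            for j in range(n):
                if j != i:
                    grad[j] += residual * s[j]
        w = [w[j] + lr * grad[j] / len(samples) for j in range(n)]
    return w


rng = random.Random(0)
n = 5        # number of spins (assumed toy size)
beta = 0.5   # assumed inverse temperature, chosen in the high-temperature regime

# SK-like true couplings: symmetric, zero diagonal, variance beta^2 / n.
J_true = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        J_true[i][j] = J_true[j][i] = rng.gauss(0.0, beta / math.sqrt(n))

samples = sample_ising(J_true, 500, rng)

# Each row is learned independently, so J_hat need not be symmetric.
J_hat = [fit_spin(i, samples) for i in range(n)]

# Reconstruction error: Euclidean distance between inferred and true couplings.
error = math.sqrt(sum((J_hat[i][j] - J_true[i][j]) ** 2
                      for i in range(n) for j in range(n) if j != i))
print("reconstruction error:", error)
```

Repeating the experiment with different sample counts and plotting `error` against the number of samples per spin would give an empirical learning curve of the kind the theory describes, although a system this small is far from the large-system limit analysed in the paper.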