PPI-SVRG: Unifying Prediction-Powered Inference and Variance Reduction for Semi-Supervised Optimization

Ruicheng Ao, Hongyu Chen, Haoyang Liu, David Simchi-Levi, Will Wei Sun

arXiv:2601.21470v14.92 citations, h-index: 4

Originality: Incremental advance

AI Analysis

The paper addresses label scarcity in machine learning by using predictions from pre-trained models. It offers a method whose convergence stays stable as prediction quality degrades, an incremental improvement over existing variance reduction techniques.

The paper tackles semi-supervised stochastic optimization with scarce labeled data by unifying prediction-powered inference (PPI) and stochastic variance reduced gradient (SVRG) into PPI-SVRG. It reports that PPI-SVRG reduces MSE by 43-52% on mean estimation benchmarks under label scarcity and improves test accuracy by 2.7-2.9 percentage points on MNIST with 10% labeled data.

We study semi-supervised stochastic optimization when labeled data is scarce but predictions from pre-trained models are available. PPI and SVRG both reduce variance through control variates: PPI uses predictions, SVRG uses reference gradients. We show they are mathematically equivalent and develop PPI-SVRG, which combines both. Our convergence bound decomposes into the standard SVRG rate plus an error floor from prediction uncertainty. The rate depends only on loss geometry; predictions affect only the neighborhood size. When predictions are perfect, we recover SVRG exactly. When predictions degrade, convergence remains stable but reaches a larger neighborhood. Experiments confirm the theory: PPI-SVRG reduces MSE by 43-52% under label scarcity on mean estimation benchmarks and improves test accuracy by 2.7-2.9 percentage points on MNIST with only 10% labeled data.
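The control-variate idea behind the abstract can be illustrated with a minimal sketch of prediction-powered mean estimation. This is not the paper's PPI-SVRG algorithm or its benchmark setup: the data are synthetic Gaussians, and the names `ppi_mean` and `simulate`, the sample sizes, and the prediction noise level are all assumptions chosen for illustration. The estimator takes the small-sample labeled mean, subtracts the mean prediction on the same labeled points, and adds back the mean prediction on the large unlabeled pool. SVRG has the same shape: a stochastic gradient at the current iterate, minus that sample's gradient at a reference point, plus the full gradient at the reference point.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def ppi_mean(labeled_y, labeled_pred, unlabeled_pred):
    # Control-variate form:
    #   noisy estimate - correlated term + accurate mean of the correlated term
    return mean(labeled_y) - mean(labeled_pred) + mean(unlabeled_pred)


def simulate(trials=200, n_labeled=20, n_unlabeled=500, pred_noise=0.3, seed=0):
    """Compare the MSE of the labeled-only mean and the PPI mean.

    Synthetic setup (an assumption, not the paper's benchmark): labels are
    N(0, 1) and each prediction is the label plus N(0, pred_noise^2) noise.
    """
    rng = random.Random(seed)
    true_mean = 0.0
    se_classical = 0.0
    se_ppi = 0.0
    for _ in range(trials):
        y_lab = [rng.gauss(true_mean, 1.0) for _ in range(n_labeled)]
        f_lab = [y + rng.gauss(0.0, pred_noise) for y in y_lab]
        # Unlabeled pool: labels exist but are never observed, only predictions.
        f_unl = [
            rng.gauss(true_mean, 1.0) + rng.gauss(0.0, pred_noise)
            for _ in range(n_unlabeled)
        ]
        se_classical += (mean(y_lab) - true_mean) ** 2
        se_ppi += (ppi_mean(y_lab, f_lab, f_unl) - true_mean) ** 2
    return se_classical / trials, se_ppi / trials


classical_mse, ppi_mse = simulate()
print(f"labeled-only MSE: {classical_mse:.4f}")
print(f"PPI MSE:          {ppi_mse:.4f}")
```

In this toy setting the PPI estimate should have a noticeably lower MSE because the predictions track the labels closely. Raising `pred_noise` weakens the correlation and shrinks the gain, which loosely mirrors the abstract's point that prediction quality affects the size of the error floor rather than the rate.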
