LG IT Oct 3, 2021

Information-Theoretic Characterization of the Generalization Error for Iterative Semi-Supervised Learning

Haiyun He, Hanshu Yan, Vincent Y. F. Tan

arXiv:2110.00926v49.912 citations Has Code

Originality: Incremental advance

AI Analysis

This work analyzes the generalization error of iterative semi-supervised learning, offering theoretical insight and a practical regularization strategy. It is rated incremental because it builds on existing generalization-error analyses.

The authors derive an exact expression for the generalization error of iterative semi-supervised learning and specialize it to the binary Gaussian mixture model. When the class-conditional variances are small, the error decreases with the number of pseudo-labelling iterations and then quickly saturates; when they are large, the error increases with the number of iterations, and regularization mitigates this effect. Experiments on MNIST and CIFAR support these findings.

Using information-theoretic principles, we consider the generalization error (gen-error) of iterative semi-supervised learning (SSL) algorithms that iteratively generate pseudo-labels for a large amount of unlabelled data to progressively refine the model parameters. In contrast to most previous works that bound the gen-error, we provide an exact expression for the gen-error and particularize it to the binary Gaussian mixture model. Our theoretical results suggest that when the class-conditional variances are not too large, the gen-error decreases with the number of iterations but quickly saturates. On the flip side, if the class-conditional variances (and hence the amount of overlap between the classes) are large, the gen-error increases with the number of iterations. To mitigate this undesirable effect, we show that regularization can reduce the gen-error. The theoretical results are corroborated by extensive experiments on the MNIST and CIFAR datasets, in which we observe that for easy-to-distinguish classes the gen-error improves after several pseudo-labelling iterations but saturates afterwards, and that for more difficult-to-distinguish classes regularization improves the generalization performance.
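To make the setting in the abstract concrete, here is a minimal sketch of iterative pseudo-labelling on a binary Gaussian mixture, written with the Python standard library only. It is not the authors' implementation. The function names, the mean-estimation update, the batch sizes, the choice of a fresh unlabelled batch per iteration, and the use of a held-out sample to approximate the population loss are all assumptions made for illustration. The reported gap (held-out loss minus loss on the pseudo-labelled batch) is only an empirical stand-in for the gen-error that the paper characterizes exactly.

```python
import random


def sample(n, mu, sigma, rng):
    """Draw n points from a balanced binary mixture: x ~ N(y * mu, sigma^2 I)."""
    data = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        x = [y * m + rng.gauss(0.0, sigma) for m in mu]
        data.append((x, y))
    return data


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def fit_mean(pairs):
    """Estimate the class mean as the average of y * x over (x, y) pairs."""
    d = len(pairs[0][0])
    n = len(pairs)
    return [sum(y * x[j] for x, y in pairs) / n for j in range(d)]


def avg_loss(theta, pairs, sigma):
    """Average Gaussian negative log-likelihood, up to an additive constant."""
    total = 0.0
    for x, y in pairs:
        total += sum((xj - y * tj) ** 2 for xj, tj in zip(x, theta))
    return total / (2 * sigma ** 2 * len(pairs))


def pseudo_label(theta, xs):
    """Assign each point the label sign(theta . x), with ties going to +1."""
    return [(x, 1 if dot(theta, x) >= 0 else -1) for x in xs]


def run(sigma, n_labelled=10, m_unlabelled=1000, iterations=5, seed=0):
    """Return the empirical loss gap after each pseudo-labelling iteration."""
    rng = random.Random(seed)
    mu = [1.0, 1.0]  # hypothetical true class mean, chosen for illustration
    held_out = sample(5000, mu, sigma, rng)  # approximates the population loss
    theta = fit_mean(sample(n_labelled, mu, sigma, rng))  # initial supervised fit
    gaps = []
    for _ in range(iterations):
        # Fresh unlabelled batch: true labels are discarded before use.
        batch = [x for x, _ in sample(m_unlabelled, mu, sigma, rng)]
        pseudo = pseudo_label(theta, batch)
        theta = fit_mean(pseudo)  # refit on pseudo-labelled data only
        gaps.append(avg_loss(theta, held_out, sigma) - avg_loss(theta, pseudo, sigma))
    return gaps


if __name__ == "__main__":
    for sigma in (0.5, 2.0):  # small vs. large class-conditional spread
        print(sigma, [round(g, 4) for g in run(sigma)])
```

Running the script prints the gap per iteration for a small and a large value of `sigma`, which gives a rough way to explore the two regimes described in the abstract. A single seeded run is noisy, so any trend should be checked by averaging over many seeds.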
