LG AI CL CV May 11, 2025

Technical Report: Quantifying and Analyzing the Generalization Power of a DNN

Yuxuan He, Junpeng Zhang, Lei Cheng, Hongyuan Zhang, Quanshi Zhang

arXiv:2505.06993v27.11 citations h-index: 29

Originality: Incremental advance

AI Analysis

The paper offers a new perspective on generalization gaps in deep learning. The advance is incremental, but it addresses a core challenge for researchers and practitioners.

The paper tackles the problem of analyzing generalization in deep neural networks by quantifying the dynamics of generalizable and non-generalizable interactions during training, discovering a three-phase pattern where early phases learn simple interactions and later phases capture complex, less generalizable ones.

This paper proposes a new perspective for analyzing the generalization power of deep neural networks (DNNs), i.e., directly disentangling and analyzing the dynamics of generalizable and non-generalizable interactions encoded by a DNN throughout the training process. Specifically, this work builds upon a recent theoretical achievement in explainable AI, which proves that the detailed inference logic of DNNs can be strictly rewritten as a small number of AND-OR interaction patterns. Based on this, we propose an efficient method to quantify the generalization power of each interaction, and we discover distinct three-phase dynamics in the generalization power of interactions during training. In particular, the early phase of training typically removes noisy, non-generalizable interactions and learns simple, generalizable ones. The second and third phases tend to capture increasingly complex interactions that are harder to generalize. Experimental results verify that the learning of non-generalizable interactions is the direct cause of the gap between the training and testing losses.
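To make the notion of an interaction pattern concrete, here is a minimal sketch of one common formalization of AND interactions, the Harsanyi dividend I(S) = sum over T ⊆ S of (-1)^(|S|-|T|) v(T), where v(T) is the model output when only the input variables in T are left unmasked. This is an assumption for illustration, not the paper's implementation: the abstract does not state the formula, OR interactions and the generalization-power metric are not covered, and `toy_model` is a hypothetical stand-in for a real DNN.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def and_interactions(v, n):
    """Harsanyi dividends I(S) = sum_{T subset of S} (-1)^(|S|-|T|) * v(T).

    `v` maps a frozenset of unmasked variable indices to a scalar output.
    The cost is exponential in n, so this is only feasible for small n.
    """
    return {
        S: sum((-1) ** (len(S) - len(T)) * v(T) for T in subsets(S))
        for S in subsets(range(n))
    }


def toy_model(present):
    """Hypothetical scalar model output (assumption, not a trained DNN)."""
    x0, x1, x2 = (i in present for i in range(3))
    return 2.0 * x0 + 3.0 * (x1 and x2) - 1.0 * (x0 and x1 and x2)


interactions = and_interactions(toy_model, 3)

# Keep only the salient (non-zero) patterns: the sparse "inference logic".
salient = {S: w for S, w in interactions.items() if abs(w) > 1e-9}
for S, w in sorted(salient.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(S), w)
```

In this toy case only three of the eight possible patterns are non-zero, and summing the interactions of all subsets of any masked input T recovers the output v(T) exactly. That is the sense in which a model's output can be rewritten as a small set of interaction patterns.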
