AI · May 4

Deciphering Shortcut Learning from an Evolutionary Game Theory Perspective

arXiv:2605.02658 · 39.4

Predicted impact: top 85% in AI · last 90 days · Originality: highly original

AI Analysis

For deep learning researchers, this work offers a formal theoretical framework to understand the origins of shortcut bias, which is a fundamental problem in model robustness and generalization.

This paper analyzes shortcut learning in deep neural networks through evolutionary game theory. It shows that gradient descent primarily optimizes the shortcut subnetwork while stochastic gradient descent primarily optimizes the core subnetwork, and it characterizes how data noise and optimization noise contribute to the formation of shortcut bias.

Shortcut learning causes deep learning models to rely on non-essential features within the data. However, its formation in deep neural network training still lacks theoretical understanding. In this paper, we provide a formal definition of core and shortcut features and employ evolutionary game theory to analyze the origins of shortcut bias by modeling data samples as players and their corresponding neural tangent features as strategies, assuming the existence of core and shortcut subnetworks. We find that gradient descent (GD) and stochastic gradient descent (SGD) lead to two distinct stochastically stable states, each corresponding to a different strategy. The former primarily optimizes the shortcut subnetwork, while the latter primarily optimizes the core subnetwork. We investigate the influence of these strategies on shortcut bias through a continuous stochastic differential equation, and reveal the impact of data noise and optimization noise on the formation of shortcut bias. In brief, our work employs evolutionary game theory to characterize the dynamics of shortcut bias formation and provides a theoretical view on its mitigation.
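
The abstract's contrast between GD and SGD rests on optimization noise: a minibatch gradient is the full-batch gradient plus a sampling deviation. The sketch below only illustrates that decomposition. It assumes a toy least-squares loss and synthetic data; the names `full_gradient` and `minibatch_gradient` are hypothetical, and it does not reproduce the paper's game-theoretic model or its core and shortcut subnetworks.

```python
import numpy as np

def full_gradient(w, X, y):
    """Gradient of the loss 0.5 * mean((X @ w - y)**2) over the given samples."""
    residual = X @ w - y
    return X.T @ residual / len(y)

def minibatch_gradient(w, X, y, idx):
    """The same gradient, estimated only on the samples indexed by idx."""
    return full_gradient(w, X[idx], y[idx])

# Synthetic data (an assumption for illustration, not taken from the paper).
rng = np.random.default_rng(0)
n, d, batch = 200, 2, 10
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -0.5]) + 0.1 * rng.normal(size=n)
w = np.zeros(d)

# GD uses the gradient over all samples.
g_full = full_gradient(w, X, y)

# SGD uses one minibatch at a time; here, one epoch of disjoint equal-size batches.
perm = rng.permutation(n)
batches = perm.reshape(n // batch, batch)
g_batches = np.array([minibatch_gradient(w, X, y, b) for b in batches])

# Optimization noise: deviation of each minibatch gradient from the full gradient.
noise = g_batches - g_full
mean_noise = noise.mean(axis=0)

print("full gradient:        ", g_full)
print("mean of noise:        ", mean_noise)
print("std of noise per dim: ", noise.std(axis=0))
```

Because the batches are disjoint, equal-sized, and cover the whole dataset, the deviations average to zero over the epoch (up to floating-point error), while each individual step still sees a nonzero perturbation. That perturbation is the optimization noise whose effect the paper analyzes.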
