On the Joint Interaction of Models, Data, and Features
This work provides incremental theoretical insight into feature learning, addressing a gap in our understanding of how models and data interact through features.
The paper introduces an interaction tensor for empirically analyzing feature learning in deep learning, and builds on it a conceptual framework that explains phenomena such as the Generalization Disagreement Equality and identifies data distributions that break it.
Learning features from data is one of the defining characteristics of deep learning, but our theoretical understanding of the role features play in deep learning is still rudimentary. To address this gap, we introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features. With the interaction tensor, we make several key observations about how features are distributed in data and how models with different random seeds learn different features. Based on these observations, we propose a conceptual framework for feature learning. Under this framework, the expected accuracy of a single hypothesis and the agreement between a pair of hypotheses can both be derived in closed form. We demonstrate that the proposed framework can explain empirically observed phenomena, including the recently discovered Generalization Disagreement Equality (GDE), which allows the generalization error to be estimated with only unlabeled data. Further, our theory provides an explicit construction of natural data distributions that break the GDE. Thus, we believe this work provides valuable new insight into our understanding of feature learning.
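To make the GDE concrete, the following minimal sketch compares the two quantities it relates: the agreement rate between two models, which needs no labels, and their test accuracy, which does. The GDE says that these are approximately equal in expectation over pairs of models trained with different random seeds. The function names `agreement_rate` and `accuracy` are hypothetical, and the predictions are hand-written toy values chosen so that the two quantities coincide; they are not results from the paper.

```python
def agreement_rate(preds_a, preds_b):
    """Fraction of inputs on which two models predict the same class.

    Uses predictions only, so it can be computed on unlabeled data.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def accuracy(preds, labels):
    """Fraction of inputs on which the predictions match the true labels."""
    if len(preds) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data (assumption: illustrative values only, not experimental results).
labels = [0, 1, 1, 0, 2, 2, 1, 0]
preds_a = [0, 1, 1, 0, 2, 1, 1, 1]  # model trained with seed A (hypothetical)
preds_b = [0, 1, 0, 0, 2, 2, 1, 1]  # model trained with seed B (hypothetical)

# Label-free estimate of accuracy suggested by the GDE.
estimated_accuracy = agreement_rate(preds_a, preds_b)

# Ground-truth quantity, available only when labels are.
mean_accuracy = (accuracy(preds_a, labels) + accuracy(preds_b, labels)) / 2

print(f"agreement (unlabeled estimate): {estimated_accuracy:.2f}")
print(f"mean test accuracy (labeled):   {mean_accuracy:.2f}")
```

In practice the equality is a statement about expectations, so one would average both quantities over many model pairs and a large test set rather than over a single pair as here. On the data distributions the abstract describes as breaking the GDE, the two averages would no longer match.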