Measuring Compositionality in Representation Learning
This work addresses a gap in machine learning: the lack of tools for evaluating compositionality in representations. The contribution is incremental, since it builds on existing linguistic concepts.
The paper tackles the problem of measuring compositional structure in learned representations, for which machine learning lacks general-purpose tools. It introduces a procedure that approximates the representation-producing model with one that explicitly composes inferred primitives, and it provides formal and empirical characterizations across a variety of settings.
Many machine learning algorithms represent input data with vector embeddings or discrete codes. When inputs exhibit compositional structure (e.g. objects built from parts or procedures from subroutines), it is natural to ask whether this compositional structure is reflected in the inputs' learned representations. While the assessment of compositionality in languages has received significant attention in linguistics and adjacent fields, the machine learning literature lacks general-purpose tools for producing graded measurements of compositional structure in more general (e.g. vector-valued) representation spaces. We describe a procedure for evaluating compositionality by measuring how well the true representation-producing model can be approximated by a model that explicitly composes a collection of inferred representational primitives. We use the procedure to provide formal and empirical characterizations of compositional structure in a variety of settings, exploring the relationship between compositionality and learning dynamics, human judgments, representational similarity, and generalization.
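The procedure described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the composition function, the distance, or the optimizer, so the code assumes additive composition, squared Euclidean error, and plain gradient descent. All names (`compose`, `fit_primitives`, `reconstruction_error`) and the toy color/shape data are hypothetical. The sketch infers one vector per primitive so that summed primitives approximate the observed representations, then reports the remaining error as a graded score, where lower means more compositional.

```python
# Hypothetical sketch. Additive composition, squared error, and gradient
# descent are assumptions made for illustration, not details from the abstract.

def compose(parts, primitives, dim):
    """Compose a representation by summing the primitive vectors of its parts."""
    out = [0.0] * dim
    for p in parts:
        for i, v in enumerate(primitives[p]):
            out[i] += v
    return out


def fit_primitives(data, dim, steps=2000, lr=0.05):
    """Infer primitive vectors by gradient descent on mean squared error.

    data: list of (parts, representation) pairs, where parts is a tuple of
    primitive names and representation is a list of floats of length dim.
    """
    names = sorted({p for parts, _ in data for p in parts})
    primitives = {n: [0.0] * dim for n in names}
    for _ in range(steps):
        grads = {n: [0.0] * dim for n in names}
        for parts, rep in data:
            approx = compose(parts, primitives, dim)
            for p in parts:
                for i in range(dim):
                    grads[p][i] += 2.0 * (approx[i] - rep[i])
        for n in names:
            for i in range(dim):
                primitives[n][i] -= lr * grads[n][i] / len(data)
    return primitives


def reconstruction_error(data, primitives, dim):
    """Mean squared distance between observed and composed representations."""
    total = 0.0
    for parts, rep in data:
        approx = compose(parts, primitives, dim)
        total += sum((a - r) ** 2 for a, r in zip(approx, rep))
    return total / len(data)


# Toy data: each input is a (color, shape) pair with a 2-d representation.
# These representations are exactly additive in color and shape.
compositional = [
    (("red", "square"), [3.0, 2.0]),
    (("red", "circle"), [0.0, 3.0]),
    (("blue", "square"), [2.0, 3.0]),
    (("blue", "circle"), [-1.0, 4.0]),
]
# Same inputs, but one representation no longer fits any additive model.
non_compositional = compositional[:3] + [(("blue", "circle"), [5.0, -5.0])]

err_comp = reconstruction_error(
    compositional, fit_primitives(compositional, dim=2), dim=2)
err_noncomp = reconstruction_error(
    non_compositional, fit_primitives(non_compositional, dim=2), dim=2)
print(err_comp, err_noncomp)
```

On the additive data the error is driven to approximately zero, while the perturbed data leaves a clearly nonzero residual. The inferred primitives themselves are not unique under additive composition (a constant can be shifted between colors and shapes), which is why the sketch scores the reconstruction error and does not compare the primitives directly.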