James H. von Brecht

3 papers

106 citations

Novelty 53%

AI Score 27

Ranked #163,016 of 201,326 authors (top 81%); #35,933 in LG (top 84%)

3 Papers

LG May 29, 2022

Long-Tailed Learning Requires Feature Learning

Thomas Laurent, James H. von Brecht, Xavier Bresson

We propose a simple data model inspired by natural data such as text or images, and use it to study the importance of learning features in order to achieve good generalization. Our data model follows a long-tailed distribution in the sense that some rare subcategories have few representatives in the training set. In this context we provide evidence that a learner succeeds if and only if it identifies the correct features, and moreover derive non-asymptotic generalization error bounds that precisely quantify the penalty that one must pay for not learning features.

LG May 25, 2023

Feature Collapse

Thomas Laurent, James H. von Brecht, Xavier Bresson

We formalize and study a phenomenon called feature collapse that makes precise the intuitive idea that entities playing a similar role in a learning task receive similar representations. As feature collapse requires a notion of task, we leverage a simple but prototypical NLP task to study it. We start by showing experimentally that feature collapse goes hand in hand with generalization. We then prove that, in the large sample limit, distinct words that play identical roles in this NLP task receive identical local feature representations in a neural network. This analysis reveals the crucial role that normalization mechanisms, such as LayerNorm, play in feature collapse and in generalization.

ML Jun 5, 2013

Multiclass Total Variation Clustering

Xavier Bresson, Thomas Laurent, David Uminsky et al.

Ideas from the image processing literature have recently motivated a new set of clustering algorithms that rely on the concept of total variation. While these algorithms perform well for bi-partitioning tasks, their recursive extensions yield unimpressive results for multiclass clustering tasks. This paper presents a general framework for multiclass total variation clustering that does not rely on recursion. The results greatly outperform previous total variation algorithms and compare well with state-of-the-art NMF approaches.