LG | Oct 2, 2022

OCD: Learning to Overfit with Conditional Diffusion Models

Meta AI
arXiv:2210.00471v5 | 13 citations | h-index: 63 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficient per-sample adaptation in machine learning. Its contribution is an incremental advance: it applies diffusion models to the generation of network weights.

The authors generate network weights dynamically, conditioned on each input sample, so that the resulting network mimics one finetuned on that sample. A conditional diffusion model produces the weights of a single layer, and repeated sampling yields an ensemble that improves performance further. The method is evaluated on image classification, 3D reconstruction, tabular data, speech separation, and natural language processing, and the code is publicly available.

We present a dynamic model in which the weights are conditioned on an input sample x and are learned to match those that would be obtained by finetuning a base model on x and its label y. This mapping between an input sample and network weights is approximated by a denoising diffusion model. The diffusion model we employ focuses on modifying a single layer of the base model and is conditioned on the input, activations, and output of this layer. Since the diffusion model is stochastic in nature, multiple initializations generate different networks, forming an ensemble, which leads to further improvements. Our experiments demonstrate the wide applicability of the method for image classification, 3D reconstruction, tabular data, speech separation, and natural language processing. Our code is available at https://github.com/ShaharLutatiPersonal/OCD
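The pipeline described in the abstract (sample a weight update for one layer by reverse diffusion, apply it, repeat to form an ensemble) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the authors' repository: the step count and noise schedule, the function names, the hand-written `toy_noise_predictor` that stands in for the trained conditional denoiser, the use of only the layer's input and output as conditioning, and the choice to combine ensemble members by averaging their outputs.

```python
import math
import random

T = 20  # number of diffusion steps (assumed)
BETAS = [1e-4 + (0.2 - 1e-4) * i / (T - 1) for i in range(T)]  # linear schedule (assumed)
ALPHAS = [1.0 - b for b in BETAS]
ALPHA_BARS = []
_prod = 1.0
for _a in ALPHAS:
    _prod *= _a
    ALPHA_BARS.append(_prod)


def linear(weights, x):
    """Apply one linear layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def toy_noise_predictor(noisy, t, x, h):
    """Hypothetical stand-in for the trained conditional denoiser.

    It pretends the "overfit" weight update is a rank-one function of the
    layer input x and output h, and is deliberately imperfect (factor 0.9)
    so that different starting noise leads to different final samples.
    """
    target = [0.1 * hi * xj for hi in h for xj in x]
    ab = ALPHA_BARS[t]
    return [0.9 * (n - math.sqrt(ab) * tg) / math.sqrt(1.0 - ab)
            for n, tg in zip(noisy, target)]


def sample_delta(x, h, rng):
    """DDPM-style ancestral sampling of a flattened weight update."""
    delta = [rng.gauss(0.0, 1.0) for _ in range(len(h) * len(x))]
    for t in reversed(range(T)):
        eps = toy_noise_predictor(delta, t, x, h)
        coef = BETAS[t] / math.sqrt(1.0 - ALPHA_BARS[t])
        mean = [(d - coef * e) / math.sqrt(ALPHAS[t]) for d, e in zip(delta, eps)]
        sigma = math.sqrt(BETAS[t]) if t > 0 else 0.0  # no noise on the last step
        delta = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return delta


def ensemble_predict(base_w, x, n_members=5, seed=0):
    """Sample several per-input layers and average their outputs (assumed rule)."""
    rng = random.Random(seed)
    h = linear(base_w, x)  # output of the unmodified layer, used as conditioning
    cols = len(x)
    outputs = []
    for _ in range(n_members):
        delta = sample_delta(x, h, rng)
        w = [[base_w[i][j] + delta[i * cols + j] for j in range(cols)]
             for i in range(len(base_w))]
        outputs.append(linear(w, x))
    avg = [sum(o[i] for o in outputs) / n_members for i in range(len(h))]
    return avg, outputs
```

Calling `ensemble_predict([[0.5, -0.2, 0.1], [0.3, 0.8, -0.4]], [1.0, 2.0, -1.0])` returns the averaged output together with the individual member outputs. In the paper's setting the denoiser is a learned network trained to match weights obtained by finetuning on (x, y); the closed-form target used here only keeps the sketch self-contained.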

Code Implementations: 1 repo
