CV · Mar 5, 2024

Few-shot Learner Parameterization by Diffusion Time-steps

arXiv:2403.02649v2 · 17 citations · h-index: 18 · Has Code · CVPR
AI Analysis

This work addresses few-shot learning in computer vision, particularly for fine-grained and customized tasks, by using diffusion models to improve classification accuracy. It is incremental in that it builds on existing diffusion and adapter techniques.

The paper tackles few-shot learning by observing that the time-steps of a diffusion model can isolate nuanced class attributes. It proposes class-specific low-rank adapters that parameterize these attributes for classification, and reports significant gains over OpenCLIP and its adapters on fine-grained and customized tasks.
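To make "class-specific low-rank adapters" concrete, here is a minimal sketch of the general low-rank adapter idea (a frozen weight W plus a trainable update B·A of small rank). It is not the paper's implementation: the layer sizes, the rank, the class names, and the helpers `make_adapter` and `adapted_weight` are all illustrative assumptions.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def make_adapter(d_out, d_in, rank, rng):
    """Hypothetical helper: return (B, A) with B zero-initialised and A small random.

    B is d_out x rank and A is rank x d_in, so B @ A has the shape of the frozen weight.
    """
    B = [[0.0] * rank for _ in range(d_out)]
    A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    return B, A

def adapted_weight(W, adapter):
    """Frozen weight W plus the low-rank update B @ A."""
    B, A = adapter
    delta = matmul(B, A)
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]

rng = random.Random(0)
d_out, d_in, rank = 8, 8, 2  # toy sizes, assumed for illustration

# One shared frozen weight, one small adapter per class.
W = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
adapters = {name: make_adapter(d_out, d_in, rank, rng) for name in ("class_a", "class_b")}

full_params = d_out * d_in               # parameters in the frozen layer
adapter_params = rank * (d_out + d_in)   # trainable parameters per class
print(full_params, adapter_params)
```

Because B starts at zero, each adapter initially leaves the frozen layer unchanged; only the small per-class matrices would be trained on the few-shot examples.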

Even when using large multi-modal foundation models, few-shot learning is still challenging -- if there is no proper inductive bias, it is nearly impossible to keep the nuanced class attributes while removing the visually prominent attributes that spuriously correlate with class labels. To this end, we find an inductive bias that the time-steps of a Diffusion Model (DM) can isolate the nuanced class attributes, i.e., as the forward diffusion adds noise to an image at each time-step, nuanced attributes are usually lost at an earlier time-step than the spurious attributes that are visually prominent. Building on this, we propose the Time-step Few-shot (TiF) learner. We train class-specific low-rank adapters for a text-conditioned DM to make up for the lost attributes, such that images can be accurately reconstructed from their noisy versions given a prompt. Hence, at a small time-step, the adapter and prompt are essentially a parameterization of only the nuanced class attributes. For a test image, we can use the parameterization to extract only the nuanced class attributes for classification. The TiF learner significantly outperforms OpenCLIP and its adapters on a variety of fine-grained and customized few-shot learning tasks. Code is available at https://github.com/yue-zhongqi/tif.
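The claim that nuanced attributes are lost at an earlier time-step than prominent ones can be illustrated with a toy calculation. The sketch below assumes a standard DDPM-style forward process, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, with a linear beta schedule, and models each attribute as a single scalar amplitude. The schedule constants, the amplitudes 0.1 and 1.0, and the "signal below noise" criterion are assumptions chosen for illustration, not values from the paper.

```python
import math

def alpha_bar_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def loss_time_step(attribute_magnitude, alpha_bars):
    """First time-step (1-indexed) where the scaled attribute falls below the noise std.

    Scaled signal: sqrt(alpha_bar_t) * magnitude; noise std: sqrt(1 - alpha_bar_t).
    Returns None if the attribute is never drowned out within the schedule.
    """
    for t, a in enumerate(alpha_bars, start=1):
        if math.sqrt(a) * attribute_magnitude < math.sqrt(1.0 - a):
            return t
    return None

alpha_bars = alpha_bar_schedule()
nuanced_t = loss_time_step(0.1, alpha_bars)    # small-amplitude (nuanced) attribute
prominent_t = loss_time_step(1.0, alpha_bars)  # large-amplitude (prominent) attribute
print("nuanced attribute drowned at step:", nuanced_t)
print("prominent attribute drowned at step:", prominent_t)
```

Under these assumptions the small-amplitude attribute falls below the noise level much earlier than the large one, which is the ordering the abstract relies on when it restricts attention to small time-steps.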

Code Implementations: 1 repo
