CV | Oct 6, 2025

Conditional Representation Learning for Customized Tasks

arXiv:2510.04564v1 | 2 citations | h-index: 13 | Has Code
Originality: Incremental advance
AI Analysis

CRL addresses the need for more efficient and adaptable representation learning in domains such as animal habitat analysis, though the advance is incremental because it builds on existing vision-language models.

The paper tackles the mismatch between universal representations and customized downstream tasks by proposing Conditional Representation Learning (CRL), which extracts representations tailored to user-specified criteria and reports improved performance on classification and retrieval tasks.

Conventional representation learning methods learn a universal representation that primarily captures dominant semantics, which may not always align with customized downstream tasks. For instance, in animal habitat analysis, researchers prioritize scene-related features, whereas universal embeddings emphasize categorical semantics, leading to suboptimal results. Existing approaches address this with supervised fine-tuning, which, however, incurs high computational and annotation costs. In this paper, we propose Conditional Representation Learning (CRL), which aims to extract representations tailored to arbitrary user-specified criteria. Specifically, we reveal that the semantics of a space are determined by its basis, so a set of descriptive words can approximate the basis of a customized feature space. Building upon this insight, given a user-specified criterion, CRL first employs a large language model (LLM) to generate descriptive texts that construct the semantic basis, then uses a vision-language model (VLM) to project the image representation into this conditional feature space. The conditional representation better captures the semantics of the specified criterion and can be used for multiple customized tasks. Extensive experiments on classification and retrieval tasks demonstrate the superiority and generality of the proposed CRL. The code is available at https://github.com/XLearning-SCU/2025-NeurIPS-CRL.
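
The projection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it assumes that each coordinate of the conditional representation is the cosine similarity between the image embedding and one descriptive-text embedding. The function names, the word list, and the toy 3-d vectors are hypothetical stand-ins for real LLM and VLM outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def conditional_representation(image_emb, basis_text_embs):
    """Project an image embedding onto a text-defined basis.

    Assumption (not confirmed by the abstract): each output coordinate is the
    cosine similarity between the image and one descriptive-text embedding.
    """
    img = l2_normalize(image_emb)
    return [
        sum(a * b for a, b in zip(img, l2_normalize(text_emb)))
        for text_emb in basis_text_embs
    ]


# Hypothetical descriptive words an LLM might return for the criterion "habitat".
basis_words = ["forest", "desert", "ocean"]

# Toy embeddings standing in for VLM text and image encoder outputs.
basis_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_emb = [3.0, 4.0, 0.0]

rep = conditional_representation(image_emb, basis_embs)
for word, score in zip(basis_words, rep):
    print(f"{word}: {score:.2f}")
```

In this toy setup the output vector has one coordinate per descriptive word, so a downstream classifier or retrieval index would operate on criterion-specific scores rather than on the universal embedding.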
