CV, AI, LG, IV
Jan 22, 2025

Separated Inter/Intra-Modal Fusion Prompts for Compositional Zero-Shot Learning

arXiv:2501.17171v1
Natural Language Processing, Information Retrieval and AI Trends 2025
Originality: Incremental advance
AI Analysis

The paper addresses challenges in scene understanding for CZSL but appears incremental, since it builds on existing prompt-based methods.

The paper tackles two problems in Compositional Zero-Shot Learning (CZSL): recognizing subtle semantic differences and combining states with objects. It proposes diverse Prompt Learning with an Inter/Intra-Modality Fusion Synthesizer and reports improved attribute recognition performance.

Compositional Zero-Shot Learning (CZSL) aims to recognize subtle differences in meaning or the combination of states and objects through the use of known and unknown concepts during training. Existing methods focus either on prompt configuration or on using prompts to tune the pre-trained Vision-Language model. However, these methods struggle to accurately identify subtle differences in meaning or to combine states with objects. To address both issues and construct an efficient and effective CZSL technique, we propose a method that improves attribute recognition performance by using diverse Prompt Learning with an Inter/Intra-Modality Fusion Synthesizer in scene understanding involving subtle semantic differences and multiple objects.

