IR · AI · Apr 6

Evaluating Scene-based In-Situ Item Labeling for Immersive Conversational Recommendation

arXiv:2604.09698 · 43.51 citations · h-index: 3
AI Analysis

This work provides a novel evaluation paradigm for in-situ item labeling in ICRS, highlighting key challenges for future research in immersive recommendation.

The paper formalizes Immersive Conversational Recommendation Systems (ICRS) and introduces evaluation metrics for in-situ item labeling. Benchmarking IR-, LLM-, and VLM-based methods across three datasets reveals that existing methods fail to leverage scenario-specific modalities, present redundant information, and poorly anticipate proactive user needs.

The growing ubiquity of Extended Reality (XR) is driving Conversational Recommendation Systems (CRS) toward visually immersive experiences. We formalize this paradigm as Immersive CRS (ICRS), where recommended items are highlighted directly in the user's scene-based visual environment and augmented with in-situ labels. While item recommendation has been widely studied, how to select and evaluate which information to present as immersive labels remains an open problem. To this end, we introduce a principled categorization of information needs into explicit intent satisfaction and proactive information needs, and use these categories to define novel evaluation metrics for item label selection. We benchmark IR-, LLM-, and VLM-based methods across three datasets and ICRS scenarios: fashion, movie recommendation, and retail shopping. Our evaluation reveals three important limitations of existing methods: (1) they fail to leverage scenario-specific information modalities (e.g., visual cues for fashion, metadata for retail), (2) they present redundant information that is visually inferable, and (3) they poorly anticipate users' proactive information needs from explicit dialogue alone. In summary, this work both provides a novel evaluation paradigm for in-situ item labeling in ICRS and highlights key challenges for future work.

