CV | AI | Dec 4, 2025

Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild

arXiv:2512.04728v1 | h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses challenges in psychological analysis for AI applications, though it appears incremental as it builds on existing vision-language models with specific improvements.

The paper tackles generative psychological analysis of in-the-wild conversations by addressing articulatory-affective ambiguity and the lack of verifiable evaluation metrics, reporting a +86.95% gain in micro-expression detection over prior state-of-the-art methods.

Generative psychological analysis of in-the-wild conversations faces two fundamental challenges: (1) existing Vision-Language Models (VLMs) fail to resolve Articulatory-Affective Ambiguity, where visual patterns of speech mimic emotional expressions; and (2) progress is stifled by a lack of verifiable evaluation metrics capable of assessing visual grounding and reasoning depth. We propose a complete ecosystem to address these twin challenges. First, we introduce the Multilevel Insight Network for Disentanglement (MIND), a novel hierarchical visual encoder with a Status Judgment module that algorithmically suppresses ambiguous lip features based on their temporal feature variance, achieving explicit visual disentanglement. Second, we construct ConvoInsight-DB, a new large-scale dataset with expert annotations for micro-expressions and deep psychological inference. Third, we design the Mental Reasoning Insight Rating Metric (PRISM), an automated dimensional framework that uses an expert-guided LLM to measure the multidimensional performance of large mental vision models. On our PRISM benchmark, MIND significantly outperforms all baselines, achieving a +86.95% gain in micro-expression detection over prior SOTA. Ablation studies confirm that our Status Judgment disentanglement module is the most critical component for this performance leap. Our code has been open-sourced.
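The abstract says only that the Status Judgment module suppresses ambiguous lip features based on their temporal feature variance. It gives no formula. As a rough illustration of that general idea, and not the authors' implementation, the sketch below computes the mean per-dimension variance of a lip-feature sequence over time and zeroes the lip stream when that variance exceeds a threshold. The function names, the hard 0/1 gate, the threshold value, and the assumption that high variance signals speech articulation rather than affect are all hypothetical choices made for this example.

```python
from statistics import pvariance


def temporal_variance(frames):
    """Mean per-dimension variance of a feature sequence over time.

    frames: list of T feature vectors, each a list of D floats.
    """
    dims = list(zip(*frames))  # regroup into D tuples of length T
    return sum(pvariance(d) for d in dims) / len(dims)


def gate_lip_features(lip_frames, threshold=0.05):
    """Hypothetical hard gate on a lip-feature sequence.

    Assumption (not from the paper): high temporal variance means the
    lips are moving because of speech, so the lip stream is zeroed out.
    The threshold value is purely illustrative.
    """
    var = temporal_variance(lip_frames)
    speaking = var > threshold
    scale = 0.0 if speaking else 1.0
    gated = [[scale * x for x in frame] for frame in lip_frames]
    return gated, speaking, var


# Toy clips: T=4 frames, D=2 features each.
still_lips = [[0.50, 0.20], [0.51, 0.20], [0.50, 0.21], [0.49, 0.20]]
talking_lips = [[0.1, 0.9], [0.9, 0.1], [0.2, 0.8], [0.8, 0.2]]

gated_still, still_flag, still_var = gate_lip_features(still_lips)
gated_talking, talking_flag, talking_var = gate_lip_features(talking_lips)

print(still_flag, talking_flag)
```

In this toy setup the nearly static clip passes through unchanged, while the rapidly changing clip is suppressed. A learned or soft gate (for example, a sigmoid over the variance) would be an equally plausible reading of the abstract.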

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
