CV, CL | Aug 28, 2025

Improving Alignment in LVLMs with Debiased Self-Judgment

arXiv:2508.20655v2 | 2 citations | h-index: 13 | EMNLP
Originality: Incremental advance
AI Analysis

This paper addresses alignment challenges in LVLMs for applications that require reliable multimodal integration, though it appears incremental because it builds on existing tuning methods.

The paper tackles the problem of aligning visual and linguistic modalities in Large Visual-Language Models to reduce hallucinations and safety concerns. It proposes a debiased self-judgment score that lets the model improve its own alignment, and the authors report that the approach significantly outperforms traditional methods.

The rapid advancements in Large Language Models (LLMs) and Large Visual-Language Models (LVLMs) have opened up new opportunities for integrating visual and linguistic modalities. However, effectively aligning these modalities remains challenging, often leading to hallucinations (where generated outputs are not grounded in the visual input) and raising safety concerns across various domains. Existing alignment methods, such as instruction tuning and preference tuning, often rely on external datasets, human annotations, or complex post-processing, which limits scalability and increases costs. To address these challenges, we propose a novel approach that generates a debiased self-judgment score, a self-evaluation metric created internally by the model without relying on external resources. This enables the model to autonomously improve alignment. Our method enhances both decoding strategies and preference tuning processes, resulting in reduced hallucinations, enhanced safety, and improved overall capability. Empirical results show that our approach significantly outperforms traditional methods, offering a more effective solution for aligning LVLMs.
