CL, AI, LG · Jan 31, 2025

Efficient Reasoning with Hidden Thinking

arXiv:2501.19201v148 citations · h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses efficiency issues in reasoning for multimodal AI systems, though the advance is incremental because it builds on existing CoT methods.

The paper tackles the inefficiency of verbose textual reasoning in Chain-of-Thought frameworks for Multimodal Large Language Models. It proposes Heima, which condenses reasoning into hidden latent representations, achieving higher generation efficiency while maintaining or improving zero-shot task accuracy on diverse benchmarks.

Chain-of-Thought (CoT) reasoning has become a powerful framework for improving complex problem-solving capabilities in Multimodal Large Language Models (MLLMs). However, the verbose nature of textual reasoning introduces significant inefficiencies. In this work, we propose Heima (for hidden llama), an efficient reasoning framework that performs CoT reasoning in a hidden latent space. We design the Heima Encoder to condense each intermediate CoT into a compact, higher-level hidden representation using a single thinking token, which minimizes verbosity and reduces the overall number of tokens required during the reasoning process. We also design a corresponding Heima Decoder, built on traditional Large Language Models (LLMs), to adaptively interpret the hidden representations as variable-length textual sequences, reconstructing reasoning processes that closely resemble the original CoTs. Experimental results across diverse reasoning MLLM benchmarks demonstrate that the Heima model achieves higher generation efficiency while maintaining or even improving zero-shot task accuracy. Moreover, the effective reconstruction of multimodal reasoning processes with the Heima Decoder validates both the robustness and interpretability of our approach.
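
The token savings described in the abstract can be illustrated with a toy sketch. This is not the Heima Encoder: it only shows the bookkeeping effect of replacing each intermediate CoT stage with a single thinking token. The placeholder strings (`<thinking_0>`, `<thinking_1>`), the example stages, and the whitespace-based token count are all assumptions made for illustration; a real model would use its own tokenizer and learned hidden representations.

```python
# Toy illustration only: whitespace "tokens" stand in for a real tokenizer,
# and the placeholder token names and example texts are hypothetical.

def condense_stages(stages):
    """Replace each intermediate CoT stage with one placeholder thinking token."""
    return [f"<thinking_{i}>" for i, _ in enumerate(stages)]


def count_tokens(segments):
    """Count whitespace-separated tokens across all segments."""
    return sum(len(segment.split()) for segment in segments)


cot_stages = [
    "The image shows three red apples and two green apples on a table.",
    "Adding the red and green apples gives three plus two, which is five.",
]
answer = "There are five apples."

# Sequence length with the full textual reasoning versus one token per stage.
verbose_len = count_tokens(cot_stages + [answer])
condensed_len = count_tokens(condense_stages(cot_stages) + [answer])

print(f"verbose: {verbose_len} tokens, condensed: {condensed_len} tokens")
```

In this sketch the condensed sequence length depends only on the number of reasoning stages, not on how verbose each stage is, which is the source of the efficiency gain the abstract claims.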

Code Implementations: 1 repo