LG · AI · Apr 15

C-voting: Confidence-Based Test-Time Voting without Explicit Energy Functions

Kenji Kubo, Shunsuke Kamiya, Masanori Koyama, Kohei Hayashi, Yusuke Iwasawa, Yutaka Matsuo

arXiv:2604.13521 · 20.5 · h-index: 21

Predicted impact: top 27% in LG · last 90 days · Originality: Incremental advance

AI Analysis

For researchers working on test-time scaling in recurrent models, C-voting provides a simple, confidence-based alternative to energy-based methods that does not require an explicit energy function, enabling broader applicability.

The authors propose C-voting, a test-time scaling strategy for recurrent neural networks that selects the latent candidate with the highest average top-1 probability, improving accuracy on Sudoku-hard by 4.9% over energy-based voting. Combined with their ItrSA++ model, it achieves 95.2% on Sudoku-extreme (vs. 55.0% for HRM) and 78.6% on Maze (vs. 74.5%).

Neural network models with latent recurrent processing, where identical layers are recursively applied to the latent state, have gained attention as promising models for performing reasoning tasks. A strength of such models is that they enable test-time scaling, where the models can enhance their performance in the test phase without additional training. Models such as the Hierarchical Reasoning Model (HRM) and Artificial Kuramoto Oscillatory Neurons (AKOrN) can facilitate deeper reasoning by increasing the number of recurrent steps, thereby enabling the completion of challenging tasks, including Sudoku, Maze solving, and AGI benchmarks. In this work, we introduce confidence-based voting (C-voting), a test-time scaling strategy designed for recurrent models with multiple latent candidate trajectories. Initializing the latent state with multiple candidates using random variables, C-voting selects the one maximizing the average of top-1 probabilities of the predictions, reflecting the model's confidence. Additionally, it yields 4.9% higher accuracy on Sudoku-hard than the energy-based voting strategy, which is specific to models with explicit energy functions. An essential advantage of C-voting is its applicability: it can be applied to recurrent models without requiring an explicit energy function. Finally, we introduce a simple attention-based recurrent model with randomized initial values named ItrSA++, and demonstrate that when combined with C-voting, it outperforms HRM on Sudoku-extreme (95.2% vs. 55.0%) and Maze (78.6% vs. 74.5%) tasks.
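The selection rule described in the abstract can be illustrated with a minimal sketch. It assumes each of the K randomly initialized candidates has already been run through the recurrent model and produced per-position class logits of shape (K, N, C). The function names `softmax` and `c_voting`, the array layout, and the toy logits are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def c_voting(candidate_logits):
    """Pick the candidate whose predictions have the highest mean top-1 probability.

    candidate_logits: array of shape (K, N, C), where K is the number of
    candidate trajectories, N the number of predicted positions (for example,
    Sudoku cells), and C the number of classes. This layout is an assumption.
    """
    probs = softmax(candidate_logits, axis=-1)   # (K, N, C)
    top1 = probs.max(axis=-1)                    # (K, N): top-1 probability per position
    confidence = top1.mean(axis=-1)              # (K,): average top-1 probability
    best = int(np.argmax(confidence))            # index of the most confident candidate
    prediction = probs[best].argmax(axis=-1)     # (N,): predicted class per position
    return best, prediction, confidence


# Toy example: two candidates, two positions, three classes.
candidate_logits = np.array([
    [[0.1, 0.0, 0.0], [0.0, 0.2, 0.0]],   # nearly uniform, low confidence
    [[5.0, 0.0, 0.0], [0.0, 0.0, 5.0]],   # sharply peaked, high confidence
])
best, prediction, confidence = c_voting(candidate_logits)
```

In this toy case the second candidate is selected because its predictions are sharply peaked. The score uses only the model's output probabilities, which is consistent with the abstract's point that no explicit energy function is needed.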
