Revisiting Self-Consistency from Dynamic Distributional Alignment Perspective on Answer Aggregation
This work addresses the challenge of optimizing self-consistency for reasoning tasks by dynamically adjusting the decoding temperature during sampling. The contribution is incremental, since it builds on existing self-consistency techniques.
The paper tackles the problem of improving self-consistency in reasoning by framing it as a dynamic distributional alignment problem. It proposes a confidence-driven temperature calibration mechanism that outperforms fixed-diversity baselines under limited samples, improving both average and best-case performance on mathematical reasoning tasks.
Self-consistency improves reasoning by aggregating diverse stochastic samples, yet the dynamics behind its efficacy remain underexplored. We reframe self-consistency as a dynamic distributional alignment problem, revealing that decoding temperature not only governs sampling randomness but also actively shapes the latent answer distribution. Given that high temperatures require prohibitively large sample sizes to stabilize, while low temperatures risk amplifying biases, we propose a confidence-driven mechanism that dynamically calibrates temperature: sharpening the sampling distribution under uncertainty to align with high-probability modes, and promoting exploration when confidence is high. Experiments on mathematical reasoning tasks show this approach outperforms fixed-diversity baselines under limited samples, improving both average and best-case performance across varying initial temperatures without additional data or modules. This establishes self-consistency as a synchronization challenge between sampling dynamics and evolving answer distributions.
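The calibration loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact update rule, so everything specific here is an assumption made for illustration: confidence is taken to be the vote share of the current majority answer, and the `threshold`, `step`, `t_min`, and `t_max` parameters, the function names, and the toy sampler standing in for a language model are all hypothetical. Only the direction of the adjustment follows the abstract: the temperature is lowered (sharpening) while the vote is uncertain and raised (exploration) once confidence is high.

```python
import random
from collections import Counter


def temperature_scaled(probs, temperature):
    """Rescale a categorical distribution as p_i^(1/T), then renormalize.

    T < 1 sharpens the distribution toward its mode; T > 1 flattens it.
    """
    weights = [p ** (1.0 / temperature) for p in probs]
    total = sum(weights)
    return [w / total for w in weights]


def majority_confidence(answers):
    """Assumed confidence proxy: vote share of the current majority answer."""
    if not answers:
        return 0.0
    return Counter(answers).most_common(1)[0][1] / len(answers)


def adaptive_self_consistency(sample_answer, n_samples, init_temperature,
                              threshold=0.5, step=0.1, t_min=0.1, t_max=1.5):
    """Majority vote over n_samples answers with a temperature that adapts.

    sample_answer(temperature) must return one final answer. The update
    rule below is a hypothetical stand-in, not the paper's exact mechanism.
    """
    answers = []
    history = []  # temperature used for each draw
    temperature = init_temperature
    for _ in range(n_samples):
        answers.append(sample_answer(temperature))
        history.append(temperature)
        if majority_confidence(answers) < threshold:
            # Uncertain vote: sharpen toward high-probability modes.
            temperature = max(t_min, temperature - step)
        else:
            # Confident vote: allow more exploration.
            temperature = min(t_max, temperature + step)
    majority = Counter(answers).most_common(1)[0][0]
    return majority, history


def make_toy_sampler(answer_probs, rng):
    """Toy stand-in for an LLM: draws answers from a fixed base distribution."""
    answers = list(answer_probs)
    base = [answer_probs[a] for a in answers]

    def sample(temperature):
        weights = temperature_scaled(base, temperature)
        return rng.choices(answers, weights=weights, k=1)[0]

    return sample


if __name__ == "__main__":
    rng = random.Random(0)
    sampler = make_toy_sampler({"42": 0.4, "41": 0.3, "40": 0.3}, rng)
    answer, temps = adaptive_self_consistency(sampler, n_samples=10,
                                              init_temperature=1.0)
    print("majority answer:", answer)
    print("temperatures used:", [round(t, 2) for t in temps])
```

In this sketch the temperature is clamped to `[t_min, t_max]` so that the rescaling never divides by zero and never flattens the distribution without bound. In a real setting `sample_answer` would wrap a model call that decodes one reasoning chain at the given temperature and extracts its final answer.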