LG | Apr 17

Majority Voting for Code Generation

Tim Launer, Jonas Hübotter, Marco Bagatella, Ido Hakimi, Andreas Krause

arXiv:2604.15618 | 66.4 | h-index: 9

AI Analysis

For practitioners using LLMs for code generation, FMV offers a practical, low-overhead method to improve inference-time performance, though its gains are bounded by the base model's capabilities.

The paper investigates Functional Majority Voting (FMV) for code generation, which selects a representative solution from multiple LLM outputs based on runtime execution signatures. FMV substantially boosts performance on LiveCodeBench with low compute overhead, and when used as an aggregation strategy for test-time reinforcement learning, it increases pass@1 on holdout tasks but does not enable self-improvement beyond the base model's ceiling.

We investigate Functional Majority Voting (FMV), a method based on functional consensus for code generation with Large Language Models, which identifies a representative solution from multiple generations using their runtime execution signatures on test inputs. We find that FMV is an effective test-time inference strategy, substantially boosting performance on LiveCodeBench without a large compute overhead. Furthermore, we extend the utility of functional consensus and apply it as an aggregation strategy for label-free Test-Time Reinforcement Learning. We demonstrate that this increases pass@1 on holdout tasks, but find no evidence of self-improvement beyond the base model's performance ceiling.
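The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how signatures are encoded, how ties are broken, or which member of the winning group is returned, so those choices are assumptions here. Candidates are modeled as plain Python callables for brevity (real generated programs would be executed in a sandbox with timeouts), and the names `execution_signature` and `functional_majority_vote` are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Sequence


def execution_signature(candidate: Callable[[Any], Any],
                        test_inputs: Sequence[Any]) -> tuple:
    """Run one candidate on every test input and record its observable behavior.

    Assumption: outputs are compared via repr(), and an exception is recorded
    by its type name, so crashing candidates still get a well-defined signature.
    """
    outputs = []
    for x in test_inputs:
        try:
            outputs.append(("ok", repr(candidate(x))))
        except Exception as exc:
            outputs.append(("error", type(exc).__name__))
    return tuple(outputs)


def functional_majority_vote(candidates: Sequence[Callable[[Any], Any]],
                             test_inputs: Sequence[Any]) -> int:
    """Return the index of a representative of the largest behavioral cluster.

    Candidates with identical signatures are grouped together. Assumption:
    ties between equally large clusters go to the cluster containing the
    earliest candidate, and the first member of the cluster is returned.
    """
    clusters = defaultdict(list)
    for idx, cand in enumerate(candidates):
        clusters[execution_signature(cand, test_inputs)].append(idx)
    best = max(clusters.values(),
               key=lambda members: (len(members), -members[0]))
    return best[0]


if __name__ == "__main__":
    # Three hypothetical "generations" for the task "double the input".
    generations = [lambda x: x ** 2, lambda x: x * 2, lambda x: x + x]
    chosen = functional_majority_vote(generations, test_inputs=[1, 2, 3])
    print("selected candidate index:", chosen)
```

Note that no reference outputs are needed: the vote relies only on agreement between candidates, which is why the test inputs matter. With the single input `2`, all three generations above would produce `4` and be indistinguishable.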
