AI CCFeb 9

Debate is efficient with your time

Jonah Brown-Cohen, Geoffrey Irving, Simon C. Marshall, Ilan Newman, Georgios Piliouras, Mario Szegedy

arXiv:2602.08630v14.41 citationsh-index: 9

Originality Incremental advance

AI Analysis

This work addresses the efficiency of human oversight in AI safety for ensuring reliable verification of computational tasks, though it is incremental as it builds on prior theoretical foundations.

The paper tackles the practical cost of human oversight in AI safety via debate by analyzing the number of queries a judge must make to verify complex tasks, finding that PSPACE/poly problems can be decided with O(log n) queries, showing logarithmic oversight suffices for highly complex problems.

AI safety via debate uses two competing models to help a human judge verify complex computational tasks. Previous work has established what problems debate can solve in principle, but has not analysed the practical cost of human oversight: how many queries must the judge make to the debate transcript? We introduce Debate Query Complexity}(DQC), the minimum number of bits a verifier must inspect to correctly decide a debate. Surprisingly, we find that PSPACE/poly (the class of problems which debate can efficiently decide) is precisely the class of functions decidable with O(log n) queries. This characterisation shows that debate is remarkably query-efficient: even for highly complex problems, logarithmic oversight suffices. We also establish that functions depending on all their input bits require Omega(log n) queries, and that any function computable by a circuit of size s satisfies DQC(f) <= log(s) + 3. Interestingly, this last result implies that proving DQC lower bounds of log(n) + 6 for languages in P would yield new circuit lower bounds, connecting debate query complexity to central questions in circuit complexity.

View on arXiv PDF

Similar