AIFeb 6
Improved Upper Bounds for Slicing the HypercubeDuncan Soiffer, Nathaniel Itty, Christopher D. Rosin et al.
A collection of hyperplanes $\mathcal{H}$ slices all edges of the $n$-dimensional hypercube $Q_n$ with vertex set $\{-1,1\}^n$ if, for every edge $e$ in the hypercube, there exists a hyperplane in $\mathcal{H}$ intersecting $e$ in its interior. Let $S(n)$ be the minimum number of hyperplanes needed to slice $Q_n$. We prove that $S(n) \leq \lceil \frac{4n}{5} \rceil$, except when $n$ is an odd multiple of $5$, in which case $S(n) \leq \frac{4n}{5} +1$. This improves upon the previously known upper bound of $S(n) \leq \lceil\frac{5n}{6} \rceil$ due to Paterson reported in 1971. We also obtain new lower bounds on the maximum number of edges in $Q_n$ that can be sliced using $k<n$ hyperplanes. We prove the improved upper bound on $S(n)$ by constructing $8$ hyperplanes slicing $Q_{10}$ aided by the recently introduced CPro1: an automatic tool that uses reasoning LLMs coupled with automated hyperparameter tuning to create search algorithms for the discovery of mathematical constructions.
CLSep 26, 2025
Semantic Agreement Enables Efficient Open-Ended LLM CascadesDuncan Soiffer, Steven Kolawole, Virginia Smith
Cascade systems route computational requests to smaller models when possible and defer to larger models only when necessary, offering a promising approach to balance cost and quality in LLM deployment. However, they face a fundamental challenge in open-ended text generation: determining output reliability when generation quality lies on a continuous spectrum, often with multiple valid responses. To address this, we propose semantic agreement -- meaning-level consensus between ensemble outputs -- as a training-free signal for reliable deferral. We show that when diverse model outputs agree semantically, their consensus is a stronger reliability signal than token-level confidence. Evaluated from 500M to 70B-parameter models, we find that semantic cascades match or surpass target-model quality at 40% of the cost and reduce latency by up to 60%. Our method requires no model internals, works across black-box APIs, and remains robust to model updates, making it a practical baseline for real-world LLM deployment.