Coherent Swap Regret and Channel-Proof Learning

arXiv:2606.026556.6


AI Analysis

For researchers in quantum game theory and online learning, this work provides a rigorous regret framework and algorithm for local quantum deviations, with implications for decentralized learning in quantum games.

The paper introduces coherent swap regret as a benchmark against local CPTP deviations in quantum games, and provides an algorithm achieving $O(\sqrt{dT\log d})$ regret. It shows that non-unital use of the recommendation register is the source of hardness, and applies the result to reach $\varepsilon$-approximate separable quantum correlated equilibria in $O(\max_i d_i\log d_i/\varepsilon^2)$ rounds.
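To get a feel for how the round count scales, the sketch below simply evaluates the expression inside the $O(\cdot)$ bound. The function name `rounds_bound`, the constant `c` (a stand-in for the unspecified hidden constant), and the use of the natural logarithm are assumptions made for illustration, not details taken from the paper.

```python
import math


def rounds_bound(dims, eps, c=1.0):
    """Evaluate c * max_i d_i * log(d_i) / eps**2, rounded up.

    dims: local dimensions d_i of the players (each >= 2).
    eps:  target equilibrium accuracy (> 0).
    c:    hypothetical placeholder for the constant hidden by O(.).
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    # Only the player with the largest d_i * log(d_i) matters.
    worst = max(d * math.log(d) for d in dims)
    return math.ceil(c * worst / eps ** 2)


# Two players with local dimensions 2 and 4, target accuracy 0.1.
print(rounds_bound([2, 4], 0.1))
```

Halving `eps` roughly quadruples the value, and only the largest $d_i\log d_i$ term affects it.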

External regret certifies stability only against replacing one's behavior by a fixed alternative. In a quantum game, this misses a natural physical move: a player can apply a local completely positive trace-preserving (CPTP) map to the state it actually received or prepared. We introduce coherent swap regret as the regret benchmark against all such local CPTP deviations, and give an algorithm achieving $O(\sqrt{dT\log d})$ coherent swap regret via entropic mirror ascent on the CPTP Choi slice with a fixed-point play rule. The main result is a three-level deviation-class landscape. Replacement channels recover ordinary external regret at rate $\Theta(\sqrt{T\log d})$. Unital channels, including unitary deviations and mixtures of unitaries, have zero minimax regret. Deterministic measurement-and-preparation channels already force $\Omega(\sqrt{dT\log d})$ regret in the moderate-horizon regime, and this rate is also sufficient for all CPTP deviations. Thus the hardness comes from non-unital use of the recommendation register, not from quantum coherence alone. As an application, decentralized full-information learning in finite quantum games reaches an $\varepsilon$-approximate separable quantum correlated equilibrium after $T=O(\max_i d_i\log d_i/\varepsilon^2)$ rounds. We identify these equilibria with channel-proofness of mediated quantum recommendation protocols, give an SDP audit for local CPTP exploitability applicable to arbitrary finite-dimensional states, and include a probing-bandit extension with pseudo-regret $O(d^{4/3}T^{2/3}(\log d)^{1/3})$ under Haar-random pure-state probes.
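The unital versus non-unital distinction can be checked directly on small examples. The following is a minimal sketch, not the paper's construction: it represents a channel by Kraus operators $K_j$, tests trace preservation via $\sum_j K_j^\dagger K_j = I$ and unitality via $\sum_j K_j K_j^\dagger = I$, and builds a deterministic measure-and-prepare channel with $K_j = |f(j)\rangle\langle j|$. All function names and the choice of examples (Pauli X, a reset map, a relabeling map) are illustrative assumptions.

```python
def dagger(m):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[m[r][c].conjugate() for r in range(len(m))]
            for c in range(len(m[0]))]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_sum(mats):
    """Entrywise sum of square matrices of equal size."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) for j in range(n)] for i in range(n)]


def is_identity(m, tol=1e-12):
    n = len(m)
    return all(abs(m[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))


def is_trace_preserving(kraus):
    # sum_j K_j^dagger K_j == I
    return is_identity(mat_sum([matmul(dagger(k), k) for k in kraus]))


def is_unital(kraus):
    # sum_j K_j K_j^dagger == I, i.e. the channel maps I to I
    return is_identity(mat_sum([matmul(k, dagger(k)) for k in kraus]))


def measure_and_prepare(f, d):
    """Kraus operators K_j = |f(j)><j|: measure in the computational
    basis, then prepare the basis state |f(j)>."""
    ops = []
    for j in range(d):
        k = [[0] * d for _ in range(d)]
        k[f(j)][j] = 1
        ops.append(k)
    return ops


pauli_x = [[[0, 1], [1, 0]]]                        # unitary deviation
reset = measure_and_prepare(lambda j: 0, 2)         # always prepare |0>
relabel = measure_and_prepare(lambda j: 1 - j, 2)   # f is a permutation

for name, ch in [("pauli_x", pauli_x), ("reset", reset), ("relabel", relabel)]:
    print(name, is_trace_preserving(ch), is_unital(ch))
```

All three examples are trace preserving, but only the reset map fails the unitality test: with $K_j = |f(j)\rangle\langle j|$ the sum $\sum_j K_j K_j^\dagger = \sum_j |f(j)\rangle\langle f(j)|$ equals the identity exactly when $f$ is a permutation.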
