LG AI GT | Nov 11, 2025

Deep (Predictive) Discounted Counterfactual Regret Minimization

Hang Xu, Kai Li, Haobo Fu, Qiang Fu, Junliang Xing, Jian Cheng

arXiv:2511.08174v14.1 | h-index: 13

Originality: Incremental advance

AI Analysis

This work addresses scalability issues for AI in complex games like poker, though it is incremental as it builds on existing CFR and neural approximation methods.

The paper tackles the challenge of applying advanced counterfactual regret minimization (CFR) variants to large imperfect-information games. It proposes a model-free neural CFR algorithm that combines variance-reduced sampling with bootstrapping. Relative to other model-free neural algorithms, the authors report faster convergence on typical imperfect-information games and stronger adversarial performance in a large poker game.

Counterfactual regret minimization (CFR) is a family of algorithms for effectively solving imperfect-information games. To enhance CFR's applicability in large games, researchers use neural networks to approximate its behavior. However, existing methods are mainly based on vanilla CFR and struggle to effectively integrate more advanced CFR variants. In this work, we propose an efficient model-free neural CFR algorithm, overcoming the limitations of existing methods in approximating advanced CFR variants. At each iteration, it collects variance-reduced sampled advantages based on a value network, fits cumulative advantages by bootstrapping, and applies discounting and clipping operations to simulate the update mechanisms of advanced CFR variants. Experimental results show that, compared with model-free neural algorithms, it exhibits faster convergence in typical imperfect-information games and demonstrates stronger adversarial performance in a large poker game.
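The discounting and clipping operations mentioned in the abstract come from tabular CFR variants. The following is a minimal tabular sketch of those two update rules, not the paper's neural algorithm: it assumes the standard Discounted CFR weights (positive regrets scaled by t^α/(t^α+1), negative ones by t^β/(t^β+1)) and CFR+-style flooring at zero. The function names, the default values α=1.5 and β=0, and the choice to discount after adding the new regret are illustrative assumptions.

```python
def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret.

    Falls back to the uniform strategy when no regret is positive.
    """
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def dcfr_discount(regrets, t, alpha=1.5, beta=0.0):
    """Discount cumulative regrets at iteration t >= 1 (DCFR-style weights).

    alpha and beta are assumed defaults, not values taken from the paper.
    """
    pos_w = t ** alpha / (t ** alpha + 1.0)  # weight for positive regrets
    neg_w = t ** beta / (t ** beta + 1.0)    # weight for non-positive regrets
    return [r * (pos_w if r > 0 else neg_w) for r in regrets]


def clip_regrets(regrets):
    """CFR+-style clipping: floor cumulative regrets at zero."""
    return [max(r, 0.0) for r in regrets]


def update(regrets, instantaneous, t, clip=False):
    """Add this iteration's regrets, then either clip or discount the sum."""
    summed = [r + i for r, i in zip(regrets, instantaneous)]
    return clip_regrets(summed) if clip else dcfr_discount(summed, t)


if __name__ == "__main__":
    cumulative = update([0.0, 0.0, 0.0], [2.0, -2.0, 2.0], t=1)
    print(cumulative, regret_matching(cumulative))
```

In a neural variant, a network would stand in for the table of cumulative regrets (or advantages); how the paper reproduces these operations with bootstrapped targets is not shown here.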
