CFR-p: Counterfactual Regret Minimization with Hierarchical Policy Abstraction, and its Application to Two-player Mahjong
This work addresses the problem of scaling CFR to more complex games such as Mahjong, which is of interest to researchers and AI game developers. Its contribution is incremental, however, since it adapts an existing method to a new domain.
The authors tackled the challenge of applying Counterfactual Regret Minimization (CFR) to the complex game of two-player Mahjong by developing a hierarchical policy abstraction based on winning policies, resulting in a framework that can be generalized to other imperfect information games.
Counterfactual Regret Minimization (CFR) has proven successful in Texas Hold'em poker. We apply this algorithm to another popular imperfect information game, Mahjong. Compared to poker, Mahjong is much more complex and has many variants. We study two-player Mahjong by conducting a game-theoretic analysis and introducing a hierarchical abstraction for CFR based on winning policies. This framework can be generalized to other imperfect information games.
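For readers unfamiliar with CFR, the following is a minimal sketch of regret matching, the standard rule that CFR uses at each information set to turn accumulated regrets into a strategy. It is not the hierarchical abstraction proposed in this work, and the function name `regret_matching` and its list-based interface are assumptions chosen for illustration.

```python
from typing import List


def regret_matching(cumulative_regrets: List[float]) -> List[float]:
    """Return a strategy proportional to the positive cumulative regrets.

    Illustrative helper (hypothetical name): actions with non-positive
    regret get probability zero; if no action has positive regret, the
    strategy falls back to uniform.
    """
    # Only positive regrets contribute to the next strategy.
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    n = len(cumulative_regrets)
    if total > 0:
        return [p / total for p in positive]
    # No action is regretted: play uniformly at random.
    return [1.0 / n] * n
```

For example, `regret_matching([1.0, 3.0, -2.0])` puts probability only on the first two actions, in a 1:3 ratio. In full CFR this rule is applied at every information set, which is why an abstraction that reduces the number of information sets matters for a game as large as Mahjong.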