From Imitation to Interaction: Mastering Game of Schnapsen with Shallow Reinforcement Learning
For game AI researchers, this work demonstrates that shallow RL can compete with search-based methods in a complex card game, but the gains are conditional and incremental.
This paper shows that shallow neural network agents can master the card game Schnapsen. A reinforcement learning agent (RLBot) achieves statistically significantly higher win rates against a strong search-based baseline (RdeepBot) when combined with deeper lookahead, while supervised imitation fails to generalize.
This paper investigates whether shallow neural network agents can master the card game Schnapsen and challenge a strong search-based baseline, RdeepBot, which uses Monte Carlo sampling and lookahead search. Following an experimental design of progressively increasing complexity, we first evaluate a supervised learning agent (MLPBot) trained on replay data and then a reinforcement learning agent (RLBot) with the same shallow architecture, trained through asynchronous Monte Carlo updates and experience replay. The results show that supervised imitation does not generalize well enough to defeat strong RdeepBot opponents, whereas reinforcement learning produces substantially stronger agents. In the setting that varies the depth parameter of RdeepBot, the best performance is achieved when the learned value function is combined with deeper lookahead during gameplay, which allows RLBot to achieve statistically significantly higher win rates against the strongest evaluated RdeepBot baseline. In the sample-based setting, the gains are more conditional: the strongest performance appears at a relatively low value of the training num_samples parameter rather than increasing uniformly with stronger sampling.
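To make the training idea concrete, the following is a minimal sketch of Monte Carlo value updates with experience replay. It is not the paper's implementation: the names `ReplayBuffer` and `LogisticValue`, the two-feature toy states, and all hyperparameters are hypothetical, a logistic model stands in for the shallow network, and the asynchronous part of the training is not modeled.

```python
import math
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (features, outcome) pairs; oldest entries drop out first."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)

    def add_episode(self, feature_vectors, outcome):
        # Monte Carlo target: every state seen in the game is labelled
        # with the final result (1.0 = win, 0.0 = loss).
        for features in feature_vectors:
            self.items.append((features, outcome))

    def sample(self, batch_size, rng):
        population = list(self.items)
        return rng.sample(population, min(batch_size, len(population)))


class LogisticValue:
    """Hypothetical stand-in for a shallow value network: predicts P(win | state)."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def predict(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, batch, lr):
        # One stochastic gradient step on the log loss per sampled pair.
        for x, y in batch:
            err = self.predict(x) - y
            self.b -= lr * err
            self.w = [wi - lr * err * xi for wi, xi in zip(self.w, x)]


rng = random.Random(0)
buffer = ReplayBuffer(capacity=64)
model = LogisticValue(n_features=2)

# Toy data (an assumption for illustration): states with feature [1, 0]
# occur in won games, states with feature [0, 1] occur in lost games.
for episode in range(200):
    won = episode % 2 == 0
    states = [[1.0, 0.0]] * 2 if won else [[0.0, 1.0]] * 2
    buffer.add_episode(states, 1.0 if won else 0.0)
    model.update(buffer.sample(8, rng), lr=0.5)

print(model.predict([1.0, 0.0]), model.predict([0.0, 1.0]))
```

In this toy setup the model learns to rate the first kind of state above 0.5 and the second below it. During play, such a value estimate could be used to score the leaf positions of a lookahead search, which is the kind of combination the abstract describes.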