Acquisition of Chess Knowledge in AlphaZero
This work addresses the interpretability of neural networks, a topic of interest to both researchers and practitioners. It offers insight into how AI systems learn complex tasks, although its contribution is incremental in that it probes existing models rather than introducing new ones.
The study investigates whether the AlphaZero neural network acquires human-like chess knowledge during training. It finds evidence that the network represents a broad range of human chess concepts, and it provides a behavioural analysis of opening play that includes qualitative commentary from chess Grandmaster Vladimir Kramnik.
What is learned by sophisticated neural network agents such as AlphaZero? This question is of both scientific and practical interest. If the representations of strong neural networks bear no resemblance to human concepts, our ability to understand faithful explanations of their decisions will be restricted, ultimately limiting what we can achieve with neural network interpretability. In this work we provide evidence that human knowledge is acquired by the AlphaZero neural network as it trains on the game of chess. By probing for a broad range of human chess concepts we show when and where these concepts are represented in the AlphaZero network. We also provide a behavioural analysis focusing on opening play, including qualitative analysis from chess Grandmaster Vladimir Kramnik. Finally, we carry out a preliminary investigation looking at the low-level details of AlphaZero's representations, and make the resulting behavioural and representational analyses available online.
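To make the idea of probing concrete, the following is a minimal sketch of a linear concept probe, not the authors' implementation. It assumes a ridge-regularised linear regression as the probe and uses synthetic activations in place of real AlphaZero layers; the names `fit_linear_probe`, `probe_r2`, `make_synthetic_layer`, `early_layer` and `late_layer` are hypothetical and chosen only for illustration. A high held-out score suggests that the concept is linearly decodable from that layer, while a score near zero suggests that it is not.

```python
import numpy as np


def fit_linear_probe(activations, concept_values, l2=1e-3):
    """Fit ridge-regression weights (plus a bias) from activations to a concept value."""
    X = np.hstack([activations, np.ones((activations.shape[0], 1))])
    reg = l2 * np.eye(X.shape[1])
    reg[-1, -1] = 0.0  # leave the bias term unpenalised
    return np.linalg.solve(X.T @ X + reg, X.T @ concept_values)


def probe_r2(weights, activations, concept_values):
    """Coefficient of determination of the probe on the given data."""
    X = np.hstack([activations, np.ones((activations.shape[0], 1))])
    residual = concept_values - X @ weights
    total = concept_values - concept_values.mean()
    return 1.0 - np.sum(residual ** 2) / np.sum(total ** 2)


def make_synthetic_layer(rng, n_positions, dim, signal):
    """Hypothetical stand-in for one layer: Gaussian noise plus a linearly
    encoded concept whose strength is controlled by `signal`."""
    concept = rng.normal(size=n_positions)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    activations = rng.normal(size=(n_positions, dim))
    activations += signal * np.outer(concept, direction)
    return activations, concept


rng = np.random.default_rng(0)
scores = {}
# Assumption: signal 0.0 mimics a layer that does not encode the concept,
# signal 3.0 mimics a layer that encodes it strongly.
for name, signal in [("early_layer", 0.0), ("late_layer", 3.0)]:
    acts, concept = make_synthetic_layer(rng, 2000, 32, signal)
    weights = fit_linear_probe(acts[:1500], concept[:1500])       # train split
    scores[name] = probe_r2(weights, acts[1500:], concept[1500:])  # held-out split
    print(name, round(scores[name], 3))
```

Repeating such a fit for every layer and every training checkpoint would give one way of charting when and where a concept becomes decodable; scoring on held-out positions keeps the probe from being credited for memorising its training set.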