AI | IR | LG | Aug 12, 2024

Perceptual Similarity for Measuring Decision-Making Style and Policy Diversity in Games

arXiv:2408.06051v2 | 2 citations | h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses the challenge of quantifying playstyle diversity in games. The contribution is incremental, but it improves playstyle analysis and supports the development of AI agents with diverse strategies.

The paper tackles the problem of measuring decision-making styles in games by enhancing an existing unsupervised metric, Playstyle Distance. It reports over 90% accuracy in zero-shot playstyle classification with fewer than 512 observation-action pairs across two racing games and seven Atari games.

Defining and measuring decision-making styles, also known as playstyles, is crucial in gaming, where these styles reflect a broad spectrum of individuality and diversity. However, finding a universally applicable measure for these styles poses a challenge. Building on Playstyle Distance, the first unsupervised metric to measure playstyle similarity based on game screens and raw actions, we introduce three enhancements to increase accuracy: multiscale analysis with varied state granularity, a perceptual kernel rooted in psychology, and the utilization of the intersection-over-union method for efficient evaluation. These innovations not only advance measurement precision but also offer insights into human cognition of similarity. Across two racing games and seven Atari games, our techniques significantly improve the precision of zero-shot playstyle classification, achieving an accuracy exceeding 90 percent with fewer than 512 observation-action pairs, which is less than half an episode of these games. Furthermore, our experiments with 2048 and Go demonstrate the potential of discrete playstyle measures in puzzle and board games. We also develop an algorithm for assessing decision-making diversity using these measures. Our findings improve the measurement of end-to-end game analysis and the evolution of artificial intelligence for diverse playstyles.
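
To make the three enhancements named in the abstract more concrete, here is a minimal Python sketch of how they might fit together. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, total variation distance is swapped in as a simple stand-in for whatever action-distribution distance the paper uses, the kernel `exp(-d)` is an assumed form of the perceptual kernel, and observations are assumed to be already mapped to discrete states by some encoder.

```python
import math
from collections import Counter, defaultdict


def action_distributions(pairs):
    """Map each discrete state to a normalized distribution over actions."""
    counts = defaultdict(Counter)
    for state, action in pairs:
        counts[state][action] += 1
    return {
        state: {a: n / sum(c.values()) for a, n in c.items()}
        for state, c in counts.items()
    }


def total_variation(p, q):
    """Total variation distance between two action distributions (stand-in metric)."""
    actions = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in actions)


def playstyle_similarity(pairs_a, pairs_b):
    """Intersection-over-union style similarity between two sets of (state, action) pairs.

    Each shared state contributes exp(-d), an assumed perceptual kernel that maps
    a distance d to a similarity in (0, 1]. The sum is divided by the size of the
    union of visited states, so states seen by only one player lower the score.
    """
    dist_a = action_distributions(pairs_a)
    dist_b = action_distributions(pairs_b)
    union = set(dist_a) | set(dist_b)
    if not union:
        return 0.0
    shared = set(dist_a) & set(dist_b)
    score = sum(math.exp(-total_variation(dist_a[s], dist_b[s])) for s in shared)
    return score / len(union)


def multiscale_similarity(samples_a, samples_b, encoders):
    """Average the similarity over several state granularities.

    `encoders` is a hypothetical list of callables, each mapping a raw observation
    to a discrete state at a different level of coarseness.
    """
    scores = []
    for encode in encoders:
        pairs_a = [(encode(obs), act) for obs, act in samples_a]
        pairs_b = [(encode(obs), act) for obs, act in samples_b]
        scores.append(playstyle_similarity(pairs_a, pairs_b))
    return sum(scores) / len(scores)
```

In this sketch, two datasets that visit the same states and choose the same actions score 1.0, and datasets with no states in common score 0.0. Dividing by the union rather than the intersection means the score stays defined even when few states overlap, which is one plausible reason a small sample of observation-action pairs can suffice.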

Code Implementations: 1 repo
