CV | AI | Feb 2, 2024

BehAVE: Behaviour Alignment of Video Game Encodings

Nemanja Rašajski, Chintan Trivedi, Konstantinos Makantasis, Antonios Liapis, Georgios N. Yannakakis

arXiv:2402.01335v3 | 8.76 citations | h-index: 59 | Has Code | ECCV Workshops

Originality: Incremental advance

AI Analysis

BehAVE addresses the scalability limits of domain randomisation for video understanding, particularly in gaming contexts. Its advance is incremental: rather than proposing a new randomisation technique, it leverages the visual diversity of existing games.

The paper addresses the reliance of domain randomisation for vision models on simulation engines. It introduces BehAVE, a framework that uses commercial video games without accessing their simulation engines, and reports improvements of up to 22% in zero-shot transfer to unseen FPS games, even when trained on a game of a different genre.

Domain randomisation enhances the transferability of vision models across visually distinct domains with similar content. However, current methods heavily depend on intricate simulation engines, hampering feasibility and scalability. This paper introduces BehAVE, a video understanding framework that utilises existing commercial video games for domain randomisation without accessing their simulation engines. BehAVE taps into the visual diversity of video games for randomisation and uses textual descriptions of player actions to align videos with similar content. We evaluate BehAVE across 25 first-person shooter (FPS) games using various video and text foundation models, demonstrating its robustness in domain randomisation. BehAVE effectively aligns player behavioural patterns and achieves zero-shot transfer to multiple unseen FPS games when trained on just one game. In a more challenging scenario, BehAVE enhances the zero-shot transferability of foundation models to unseen FPS games, even when trained on a game of a different genre, with improvements of up to 22%. BehAVE is available online at https://github.com/nrasajski/BehAVE.
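The abstract describes aligning videos with textual descriptions of player actions but does not spell out the objective here. As a rough illustration only, the sketch below shows the general idea of matching a video embedding to the closest action-description embedding by cosine similarity. The function names, the action labels, and the three-dimensional toy vectors are all hypothetical; real embeddings would come from video and text foundation models, and this is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_action(video_emb, action_embs):
    """Return the action label whose text embedding best matches the video embedding."""
    return max(action_embs, key=lambda label: cosine(video_emb, action_embs[label]))


# Hypothetical text embeddings of player-action descriptions (toy values).
action_embs = {
    "move forward": [1.0, 0.0, 0.0],
    "shoot": [0.0, 1.0, 0.0],
    "jump": [0.0, 0.0, 1.0],
}

# Hypothetical embedding of a clip from a game not seen during training.
clip_from_unseen_game = [0.1, 0.9, 0.2]

predicted = nearest_action(clip_from_unseen_game, action_embs)
print(predicted)
```

If video and text encoders share an aligned space, a lookup like this needs no game-specific labels at test time, which is the sense in which such a setup could transfer zero-shot to unseen games.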

