Game-invariant Features Through Contrastive and Domain-adversarial Learning
This work addresses the need for more generalizable game vision models that require minimal retraining on new games. The contribution is incremental, however, since it builds on existing contrastive and adversarial techniques.
The paper tackles the problem of game-image encoders overfitting to game-specific visual styles, which undermines performance on new games. It presents a method that combines contrastive and domain-adversarial learning to learn game-invariant features. On a dataset of 10,000 screenshots from 10 games, the learned features no longer cluster by game after a few training epochs.
Foundational game-image encoders often overfit to game-specific visual styles, undermining performance on downstream tasks when applied to new games. We present a method that combines contrastive learning and domain-adversarial training to learn game-invariant visual features. By simultaneously encouraging similar content to cluster and discouraging game-specific cues via an adversarial domain classifier, our approach produces embeddings that generalize across diverse games. Experiments on the Bingsu game-image dataset (10,000 screenshots from 10 games) demonstrate that after only a few training epochs, our model's features no longer cluster by game, indicating successful invariance and potential for improved cross-game transfer (e.g., glitch detection) with minimal fine-tuning. This capability paves the way for more generalizable game vision models that require little to no retraining on new games.
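To make the combined objective concrete, the following is a minimal sketch of how a contrastive term and an adversarial domain term might be put together. It is not the authors' implementation: the abstract does not specify the contrastive loss, so InfoNCE is assumed here, and the temperature, the weight `lam`, and all function names are hypothetical. The sign flip in `encoder_objective` stands in for the effect of a gradient reversal layer, which is one common way to train against a domain classifier.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # L2-normalize so that dot products are cosine similarities.
    n = math.sqrt(dot(v, v)) or 1.0
    return [x / n for x in v]

def info_nce(anchors, positives, temperature=0.1):
    # Assumed contrastive loss: anchor i should match positive i
    # against every other positive in the batch.
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [dot(ai, pj) / temperature for pj in p]
        total -= log_softmax(logits)[i]
    return total / len(a)

def domain_cross_entropy(domain_logits, game_ids):
    # Loss of the domain classifier that predicts which game
    # each embedding came from.
    total = 0.0
    for logits, g in zip(domain_logits, game_ids):
        total -= log_softmax(logits)[g]
    return total / len(game_ids)

def encoder_objective(contrastive, domain_ce, lam=1.0):
    # The encoder minimizes the contrastive loss while maximizing
    # the domain classifier's loss (hypothetical weight lam).
    return contrastive - lam * domain_ce

# Toy batch: two embeddings, each paired with an identical positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
l_con = info_nce(anchors, positives)

# A domain classifier with uniform logits over 10 games is at chance.
l_dom = domain_cross_entropy([[0.0] * 10, [0.0] * 10], [3, 7])
loss = encoder_objective(l_con, l_dom, lam=1.0)
```

In this sketch the domain classifier itself would still be trained to minimize `domain_cross_entropy`; only the encoder sees the negated term. With 10 games, a domain loss near ln(10) means the classifier is doing no better than chance, which is consistent with embeddings that no longer separate by game.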