AI · Oct 23, 2025

Merge and Conquer: Evolutionarily Optimizing AI for 2048

arXiv:2510.20205v1 · h-index: 1
Originality: Synthesis-oriented
AI Analysis

This paper addresses AI optimization in dynamic environments, using the game 2048 as a testbed. The contribution is incremental: it applies known methods to a specific game.

The paper optimizes AI for the game 2048 using evolutionary training methods. The single-agent system improved by an average of 473.2 points per cycle, with a correlation of ρ = 0.607 across training cycles, while the two-agent system showed little improvement.

Optimizing artificial intelligence (AI) for dynamic environments remains a fundamental challenge in machine learning research. In this paper, we examine evolutionary training methods for optimizing AI to solve the game 2048, a 2D sliding puzzle. 2048, with its mix of strategic gameplay and stochastic elements, presents an ideal playground for studying decision-making, long-term planning, and dynamic adaptation. We implemented two distinct systems: a two-agent metaprompting system where a "thinker" large language model (LLM) agent refines gameplay strategies for an "executor" LLM agent, and a single-agent system based on refining a value function for a limited Monte Carlo Tree Search. We also experimented with rollback features to avoid performance degradation. Our results demonstrate the potential of evolutionary refinement techniques in improving AI performance in non-deterministic environments. The single-agent system achieved substantial improvements, with an average increase of 473.2 points per cycle, and with clear upward trends (correlation $\rho = 0.607$) across training cycles. The LLM's understanding of the game grew as well, shown in its development of increasingly advanced strategies. Conversely, the two-agent system did not garner much improvement, highlighting the inherent limits of meta-prompting.
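The abstract does not spell out how the rollback feature works. As a rough illustration only, the sketch below shows one plausible reading: a refinement cycle proposes a new version of the value-function parameters, and the proposal is discarded if it scores worse than the best version so far. Everything here is an assumption made for illustration: `toy_score` stands in for average 2048 game score, `perturb` replaces the LLM-driven refinement with a random tweak, and `TARGET`, `evolve_with_rollback`, and all parameter values are hypothetical names and numbers, not taken from the paper.

```python
import random

# Hypothetical optimum that the toy score rewards; not from the paper.
TARGET = [1.0, -2.0, 0.5]


def toy_score(weights):
    """Stand-in for an average game score: higher is better, maximum 0 at TARGET."""
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


def perturb(weights, rng, scale=0.3):
    """Stand-in for the refinement step (an LLM in the paper): a random Gaussian tweak."""
    return [w + rng.gauss(0.0, scale) for w in weights]


def evolve_with_rollback(initial, evaluate, refine, cycles, rng):
    """Run refinement cycles, keeping a candidate only if it does not score worse."""
    best, best_score = list(initial), evaluate(initial)
    history = [best_score]  # best score after each cycle, starting with the initial one
    for _ in range(cycles):
        candidate = refine(best, rng)
        score = evaluate(candidate)
        if score >= best_score:
            best, best_score = candidate, score
        # Otherwise roll back: the candidate is dropped and `best` is reused.
        history.append(best_score)
    return best, history


rng = random.Random(0)  # fixed seed so the run is repeatable
best, history = evolve_with_rollback(
    [0.0, 0.0, 0.0], toy_score, perturb, cycles=50, rng=rng
)
print(f"start {history[0]:.3f}, end {history[-1]:.3f}")
```

With a deterministic score, this acceptance rule makes the recorded history non-decreasing by construction. Real 2048 evaluation is stochastic, so a practical version would likely need to average over several games before deciding whether to roll back.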

