LG | AI | May 1

Multi-Perspective Transformers in ARC-AGI-2 Challenge

Caleb Talley, Vedant Tibrewal, Seun Adekunle, Weiwen Dong, Xinyu Wu, Fariha Sheikh

arXiv:2605.01154 | 36.5 | h-index: 1

AI Analysis

This work addresses the challenge of machine generalization on human-intuitive visual puzzles, but the low evaluation accuracy (21.7%, against 96.1% on the training set) indicates limited practical impact.

The authors tackled the ARC-AGI-2 benchmark of visual puzzles using a TinyLM model with test-time fine-tuning, achieving 96.1% accuracy on the training set and 21.7% on the evaluation set.

ARC-AGI-2 is a benchmark of human-intuitive visual puzzles that measures a machine's ability to generalize from limited examples, interpret symbolic meaning, and flexibly apply rules in varying contexts. In this paper, we discuss our approach to solving the ARC-AGI-2 puzzles with TinyLM, with additional fine-tuning at test time, including Test-Time Training (TTT) and Product of Experts (PoE). Our model achieves 96.1% accuracy on the training set and 21.7% accuracy on the evaluation set.
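The abstract names Product of Experts but does not say how the experts are defined or combined. As a generic illustration only, and not the authors' implementation, the sketch below shows the usual form of the idea: several scorers each assign a probability to the same candidate answers, the probabilities are multiplied (summed in log space), and the result is renormalized. The function name `product_of_experts` and the three sets of scores are hypothetical.

```python
import math


def product_of_experts(log_probs_per_expert):
    """Combine per-expert log-probabilities over the same candidates.

    log_probs_per_expert: list of lists; each inner list holds one expert's
    log-probability for every candidate, in the same candidate order.
    Returns a normalized probability for each candidate.
    """
    n_candidates = len(log_probs_per_expert[0])
    # Multiplying probabilities is the same as summing log-probabilities.
    summed = [
        sum(expert[i] for expert in log_probs_per_expert)
        for i in range(n_candidates)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(summed)
    weights = [math.exp(s - shift) for s in summed]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical scores: three "perspectives" rate two candidate output grids.
experts = [
    [math.log(0.6), math.log(0.4)],
    [math.log(0.7), math.log(0.3)],
    [math.log(0.4), math.log(0.6)],
]
combined = product_of_experts(experts)
# Index of the candidate with the highest combined probability.
best = max(range(len(combined)), key=combined.__getitem__)
```

Because the scores are multiplied, a candidate that any single expert rates near zero is strongly penalized, which is the main difference from averaging the experts' probabilities.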
