LG AI MA May 19, 2025

PLAICraft: Large-Scale Time-Aligned Vision-Speech-Action Dataset for Embodied AI

arXiv:2505.12707v1 5 citations h-index: 7
Originality: Synthesis-oriented
AI Analysis

The paper addresses the shortage of training data for embodied AI research. Its contribution is incremental in that it provides a new dataset rather than a novel method.

The authors tackled the lack of large-scale, real-time, multi-modal datasets for embodied AI by introducing PLAICraft, a dataset with over 10,000 hours of time-aligned gameplay data across five modalities, enabling the study of synchronous behavior in open-ended environments.

Advances in deep generative modelling have made it increasingly plausible to train human-level embodied agents. Yet progress has been limited by the absence of large-scale, real-time, multi-modal, and socially interactive datasets that reflect the sensory-motor complexity of natural environments. To address this, we present PLAICraft, a novel data collection platform and dataset capturing multiplayer Minecraft interactions across five time-aligned modalities: video, game output audio, microphone input audio, mouse, and keyboard actions. Each modality is logged with millisecond time precision, enabling the study of synchronous, embodied behaviour in a rich, open-ended world. The dataset comprises over 10,000 hours of gameplay from more than 10,000 global participants. (We have done a privacy review for the public release of an initial 200-hour subset of the dataset, with plans to release most of the dataset over time.) Alongside the dataset, we provide an evaluation suite for benchmarking model capabilities in object recognition, spatial awareness, language grounding, and long-term memory. PLAICraft opens a path toward training and evaluating agents that act fluently and purposefully in real time, paving the way for truly embodied artificial intelligence.
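
To make the idea of time-aligned modalities concrete, here is a minimal Python sketch of one way millisecond-stamped input events might be grouped under video frames. It is an illustration only, not the PLAICraft format or tooling: the `Event` record, its field names, the `align_events_to_frames` helper, and the 10 fps frame timestamps are all hypothetical assumptions.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical record layout; the real dataset schema may differ.
    t_ms: int      # timestamp in milliseconds
    modality: str  # e.g. "keyboard" or "mouse"
    payload: str   # e.g. "W down" or "click"


def align_events_to_frames(frame_times_ms, events):
    """Group each event under the latest frame at or before its timestamp.

    frame_times_ms must be sorted in ascending order.
    """
    buckets = [[] for _ in frame_times_ms]
    for ev in sorted(events, key=lambda e: e.t_ms):
        idx = bisect_right(frame_times_ms, ev.t_ms) - 1
        if idx >= 0:  # events before the first frame are dropped
            buckets[idx].append(ev)
    return buckets


# Assumed example data: three frames at 10 fps and three input events.
frames = [0, 100, 200]
events = [
    Event(250, "mouse", "click"),
    Event(20, "keyboard", "W down"),
    Event(100, "keyboard", "W up"),
]
aligned = align_events_to_frames(frames, events)
for t, bucket in zip(frames, aligned):
    print(t, [(e.modality, e.payload) for e in bucket])
```

Binary search over sorted frame timestamps keeps the lookup logarithmic per event, which matters when streams span many hours. Audio streams would need a different treatment (for example, slicing sample ranges per frame), which this sketch does not attempt.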

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
