CL, AI, CV, LG, RO | Sep 30, 2025

OceanGym: A Benchmark Environment for Underwater Embodied Agents

arXiv:2509.26536v1 | 2 citations | h-index: 37 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of developing robust AI for autonomous underwater vehicles. It is an incremental step in a demanding real-world domain marked by extreme perceptual and decision-making difficulties.

The authors introduce OceanGym, which they describe as the first comprehensive benchmark environment for ocean underwater settings. Their experiments reveal substantial performance gaps between state-of-the-art MLLM-driven agents and human experts across eight realistic task domains.

We introduce OceanGym, the first comprehensive benchmark for ocean underwater embodied agents, designed to advance AI in one of the most demanding real-world environments. Unlike terrestrial or aerial domains, underwater settings present extreme perceptual and decision-making challenges, including low visibility and dynamic ocean currents, making effective agent deployment exceptionally difficult. OceanGym encompasses eight realistic task domains and a unified agent framework driven by Multi-modal Large Language Models (MLLMs), which integrates perception, memory, and sequential decision-making. Agents are required to comprehend optical and sonar data, autonomously explore complex environments, and accomplish long-horizon objectives under these harsh conditions. Extensive experiments reveal substantial gaps between state-of-the-art MLLM-driven agents and human experts, highlighting the persistent difficulty of perception, planning, and adaptability in ocean underwater environments. By providing a high-fidelity, rigorously designed platform, OceanGym establishes a testbed for developing robust embodied AI and transferring these capabilities to real-world autonomous ocean underwater vehicles, marking a decisive step toward intelligent agents capable of operating in one of Earth's last unexplored frontiers. The code and data are available at https://github.com/OceanGPT/OceanGym.
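The abstract describes an agent framework that integrates perception, memory, and sequential decision-making over optical and sonar data. The following is a minimal sketch of what such a loop could look like; it is not the OceanGym API. Every name here (`Observation`, `Agent`, `toy_policy`, `run_episode`, the action strings) is hypothetical, text captions stand in for raw sensor frames, and a keyword rule stands in for the MLLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Observation:
    """Hypothetical per-step observation; captions stand in for raw optical and sonar frames."""
    optical: str
    sonar: str


@dataclass
class Agent:
    """Perception -> memory -> decision loop. `policy` is a placeholder for an MLLM call."""
    policy: Callable[[str, List[str]], str]
    memory: List[str] = field(default_factory=list)
    memory_size: int = 5  # assumed bound on how many past steps the policy sees

    def perceive(self, obs: Observation) -> str:
        # Fuse the two modalities into a single textual summary.
        return f"optical: {obs.optical}; sonar: {obs.sonar}"

    def step(self, obs: Observation) -> str:
        summary = self.perceive(obs)
        # Decide from the current summary plus a bounded window of past steps.
        action = self.policy(summary, self.memory[-self.memory_size:])
        # Record what was seen and done so later steps can condition on it.
        self.memory.append(f"{summary} -> {action}")
        return action


def toy_policy(summary: str, history: List[str]) -> str:
    """Stand-in for the MLLM: a keyword rule, not a real model."""
    if "obstacle" in summary:
        return "turn_left"
    return "move_forward"


def run_episode(agent: Agent, observations: List[Observation]) -> List[str]:
    # Feed observations in order and collect the chosen actions.
    return [agent.step(obs) for obs in observations]
```

In a real setup the `policy` callable would wrap a multimodal model that receives images and sonar returns directly, and the memory would likely be summarized rather than stored verbatim; both choices are left open here.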

Code Implementations: 1 repo