CVAIJun 2, 2025

Fire360: A Benchmark for Robust Perception and Episodic Memory in Degraded 360-Degree Firefighting Videos

arXiv:2506.02167v11 citationsh-index: 2
Originality Synthesis-oriented
AI Analysis

This addresses the critical need for robust perception in safety-critical domains like firefighting, where breakdowns in situational awareness lead to injuries, but it is incremental as it focuses on benchmarking rather than a new method.

The paper tackles the problem of AI systems failing in degraded environments like firefighting scenarios by introducing Fire360, a benchmark with 228 360-degree videos and five tasks, where human experts achieve 83.5% on a key task while models like GPT-4o lag significantly.

Modern AI systems struggle most in environments where reliability is critical - scenes with smoke, poor visibility, and structural deformation. Each year, tens of thousands of firefighters are injured on duty, often due to breakdowns in situational perception. We introduce Fire360, a benchmark for evaluating perception and reasoning in safety-critical firefighting scenarios. The dataset includes 228 360-degree videos from professional training sessions under diverse conditions (e.g., low light, thermal distortion), annotated with action segments, object locations, and degradation metadata. Fire360 supports five tasks: Visual Question Answering, Temporal Action Captioning, Object Localization, Safety-Critical Reasoning, and Transformed Object Retrieval (TOR). TOR tests whether models can match pristine exemplars to fire-damaged counterparts in unpaired scenes, evaluating transformation-invariant recognition. While human experts achieve 83.5% on TOR, models like GPT-4o lag significantly, exposing failures in reasoning under degradation. By releasing Fire360 and its evaluation suite, we aim to advance models that not only see, but also remember, reason, and act under uncertainty. The dataset is available at: https://uofi.box.com/v/fire360dataset.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes