LG, AI, CV, AS | Sep 4, 2025

Multimodal Deep Learning for ATCO Command Lifecycle Modeling and Workload Prediction

arXiv:2509.10522v1
Originality: Incremental advance
AI Analysis

This work addresses workload prediction for air traffic controllers to improve safety and efficiency in dense airspace, representing a novel domain-specific application.

The paper tackles the problem of modeling air traffic controller command lifecycles by estimating time offsets and command durations with a multimodal deep learning framework, reporting accurate and generalizable predictions on a newly constructed dataset.

Air traffic controllers (ATCOs) issue high-intensity voice commands in dense airspace, where accurate workload modeling is critical for safety and efficiency. This paper proposes a multimodal deep learning framework that integrates structured data, trajectory sequences, and image features to estimate two key parameters in the ATCO command lifecycle: the time offset between a command and the resulting aircraft maneuver, and the command duration. A high-quality dataset was constructed, with maneuver points detected using sliding window and histogram-based methods. A CNN-Transformer ensemble model was developed for accurate, generalizable, and interpretable predictions. By linking trajectories to voice commands, this work offers the first model of its kind to support intelligent command generation and provides practical value for workload assessment, staffing, and scheduling.
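
The abstract names sliding-window maneuver detection and the command-to-maneuver time offset without giving details, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes the maneuver shows up as a heading change; the function names, the window length, and the 10-degree threshold are all hypothetical choices made for illustration, and the histogram-based method is not covered.

```python
from typing import Optional, Sequence


def heading_diff(a: float, b: float) -> float:
    """Signed difference b - a in degrees, wrapped into [-180, 180)."""
    return (b - a + 180.0) % 360.0 - 180.0


def detect_maneuver_index(
    headings: Sequence[float],
    window: int = 5,              # assumed window length, in samples
    threshold_deg: float = 10.0,  # assumed minimum heading change
) -> Optional[int]:
    """Return the start index of the first window whose end-to-end
    heading change reaches the threshold, or None if no window does."""
    for start in range(len(headings) - window + 1):
        change = heading_diff(headings[start], headings[start + window - 1])
        if abs(change) >= threshold_deg:
            return start
    return None


def time_offset(
    command_time: float,
    timestamps: Sequence[float],
    headings: Sequence[float],
    window: int = 5,
    threshold_deg: float = 10.0,
) -> Optional[float]:
    """Seconds between a command and the detected maneuver point,
    or None when no maneuver is detected."""
    idx = detect_maneuver_index(headings, window, threshold_deg)
    if idx is None:
        return None
    return timestamps[idx] - command_time


# Synthetic track sampled every 4 s: level flight, then a right turn.
timestamps = [4.0 * i for i in range(21)]
headings = [90.0] * 10 + [93.0, 96.0, 99.0, 102.0, 105.0, 108.0] + [110.0] * 5
offset = time_offset(20.0, timestamps, headings)
```

In a pipeline of this kind, an offset computed this way would serve as the regression target that the multimodal model learns to predict from structured, trajectory, and image inputs; the command duration would be labeled separately from the voice recording.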
