CVMay 9, 2024

Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft

arXiv:2405.05574v22 citationsICVGIP
Originality Incremental advance
AI Analysis

This work addresses safety and reliability issues for pilots during manual landing in adverse conditions, though it appears incremental as it builds on existing vision-language and spatial transformer methods.

The paper tackles the problem of vision-based manual aircraft landing in harsh weather and crosswind conditions by proposing a system that synthesizes weather-degraded images and corrects for projective distortions, resulting in improved visibility and reduced localization error for runway object detection.

The intrinsic capability of the Human Vision System (HVS) to perceive depth of field and failure of Instrument Landing Systems (ILS) stimulates a pilot to perform a vision-based manual landing over an autoland approach. However, harsh weather creates challenges, and a pilot must have a clear view of runway elements before the minimum decision altitude. To aid in manual landing, a vision-based system trained to clear weather-induced visual degradations requires a robust landing dataset under various climatic conditions. Nevertheless, to acquire a dataset, flying an aircraft in dangerous weather impacts safety. Also, this system fails to generate reliable warnings, as localization of runway elements suffers from projective distortion while landing at crosswind. To combat, we propose to synthesize harsh weather landing images by training a prompt-based climatic diffusion network. Also, we optimize a weather distillation model using a novel diffusion-distillation loss to learn to clear these visual degradations. Precisely, the distillation model learns an inverse relationship with the diffusion network. Inference time, pre-trained distillation network directly clears weather-impacted onboard camera images, which can be further projected to display devices for improved visibility.Then, to tackle crosswind landing, a novel Regularized Spatial Transformer Networks (RuSTaN) module accurately warps landing images. It minimizes the localization error of runway object detector and helps generate reliable internal software warnings. Finally, we curated an aircraft landing dataset (AIRLAD) by simulating a landing scenario under various weather degradations and experimentally validated our contributions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes