CL LG AS, Aug 27, 2021

Improving callsign recognition with air-surveillance data in air-traffic communication

arXiv:2108.12156v1, 20 citations
Originality: Incremental advance
AI Analysis

This work aims to improve the safety and reliability of air-traffic communication between pilots and controllers. It is an incremental advance, since it builds on existing ASR methods and integrates domain-specific data.

The paper improves callsign recognition in air-traffic communication by using surveillance data to adjust callsign weights in ASR models, yielding a 28.4% absolute improvement in callsign recognition accuracy and up to a 74.2% relative improvement in callsign WER.

Automatic Speech Recognition (ASR) can be used to assist speech communication between pilots and air-traffic controllers. Its application can significantly reduce the complexity of the task and increase the reliability of the transmitted information. High-accuracy predictions are needed to minimize the risk of errors, particularly in the recognition of key information such as the commands and callsigns used to navigate pilots. Our results show that surveillance data containing callsigns can considerably improve the recognition of a callsign in an utterance when the weights of probable callsign n-grams are reduced per utterance. In this paper, we investigate two approaches: (1) G-boosting, in which callsign weights are adjusted at the language model level (G), followed by a dynamic decoder with on-the-fly composition, and (2) lattice rescoring, in which callsign information is introduced on top of lattices generated by a conventional decoder. Boosting callsign n-grams with the combination of the two methods yielded a 28.4% absolute improvement in callsign recognition accuracy and up to a 74.2% relative improvement in the WER of callsign recognition.
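The rescoring idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's implementation: it rescores an n-best list rather than a lattice, and the function names, the `discount` value, and the example hypotheses and callsigns are all hypothetical. It assumes each hypothesis carries a cost where lower is better, so lowering the cost of a hypothesis that contains a callsign from the surveillance data makes that hypothesis more likely to be selected.

```python
from typing import List, Tuple


def contains_ngram(words: List[str], ngram: List[str]) -> bool:
    """Return True if `ngram` occurs as a contiguous word sequence in `words`."""
    n = len(ngram)
    if n == 0:
        return False
    return any(words[i:i + n] == ngram for i in range(len(words) - n + 1))


def rescore_nbest(
    nbest: List[Tuple[str, float]],
    callsigns: List[str],
    discount: float = 2.0,  # hypothetical value, chosen only for illustration
) -> List[Tuple[str, float]]:
    """Lower the cost of hypotheses containing a surveillance callsign.

    `nbest` holds (text, cost) pairs with lower cost meaning more likely.
    `callsigns` is the per-utterance list of spoken-form callsigns taken
    from surveillance data. Returns the hypotheses sorted by new cost.
    """
    rescored = []
    for text, cost in nbest:
        words = text.split()
        if any(contains_ngram(words, cs.split()) for cs in callsigns):
            cost -= discount  # reduced weight boosts this hypothesis
        rescored.append((text, cost))
    return sorted(rescored, key=lambda hyp: hyp[1])


if __name__ == "__main__":
    # Hypothetical decoder output: the wrong callsign is slightly cheaper.
    nbest = [
        ("lufthansa three two alpha descend flight level eight zero", 10.0),
        ("lufthansa three two kilo descend flight level eight zero", 10.5),
    ]
    # Hypothetical callsigns currently visible in the surveillance data.
    callsigns = ["lufthansa three two kilo", "ryanair seven one"]
    for text, cost in rescore_nbest(nbest, callsigns):
        print(f"{cost:.1f}  {text}")
```

In this toy example the second hypothesis matches a surveillance callsign, so its cost drops below that of the first and it becomes the top result. A real system would apply the adjustment to n-gram weights inside the language model or lattice rather than to whole hypotheses.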
