AIJan 29, 2025
Large Language Models for Single-Step and Multi-Step Flight Trajectory PredictionKaiwei Luo, Jiliu Zhou
Flight trajectory prediction is a critical time series task in aviation. While deep learning methods have shown significant promise, the application of large language models (LLMs) to this domain remains underexplored. This study pioneers the use of LLMs for flight trajectory prediction by reframing it as a language modeling problem. Specifically, We extract features representing the aircraft's position and status from ADS-B flight data to construct a prompt-based dataset, where trajectory waypoints are converted into language tokens. The dataset is then employed to fine-tune LLMs, enabling them to learn complex spatiotemporal patterns for accurate predictions. Comprehensive experiments demonstrate that LLMs achieve notable performance improvements in both single-step and multi-step predictions compared to traditional methods, with LLaMA-3.1 model achieving the highest overall accuracy. However, the high inference latency of LLMs poses a challenge for real-time applications, underscoring the need for further research in this promising direction.
CLMay 14, 2025
Atomic Consistency Preference Optimization for Long-Form Question AnsweringJingfeng Chen, Raghuveer Thirukovalluru, Junlin Wang et al.
Large Language Models (LLMs) often produce factoid hallucinations - plausible yet incorrect answers. A common mitigation strategy is model alignment, which improves factual accuracy by training on curated (factual, non-factual) pairs. However, this approach often relies on a stronger model (e.g., GPT-4) or an external knowledge base to assess factual correctness that may not always be accessible. Addressing this, we propose Atomic Consistency Preference Optimization (ACPO), a self-supervised preference-tuning method that enhances factual accuracy without external supervision. ACPO leverages atomic consistency signals (i.e., the agreement of individual facts across multiple stochastic responses) to identify high- and low-quality data pairs for model alignment. Despite being fully self-supervised, ACPO outperforms the strong supervised alignment baseline by 1.95 points averaged across Phi-3 and Llama3 on the LongFact and BioGen datasets, demonstrating its effectiveness in improving factual reliability without relying on external models or knowledge bases.