29.6LGApr 1
LLM-Generated Fault Scenarios for Evaluating Perception-Driven Lane Following in Autonomous Edge SystemsFaezeh Pasandideh, Achim Rettberg
Deploying autonomous vision systems on edge devices faces a critical challenge: resource constraints prevent real-time and predictable execution of comprehensive safety tests. Existing validation methods depend on static datasets or manual fault injection, failing to capture the diverse environmental hazards encountered in real-world deployment. To address this, we introduce a decoupled offline-online fault injection framework. This architecture separates the validation process into two distinct phases: a computationally intensive Offline Phase and a lightweight Online Phase. In the offline phase, we employ Large Language Models (LLMs) to semantically generate structured fault scenarios and Latent Diffusion Models (LDMs) to synthesize high-fidelity sensor degradations. These complex fault dynamics are distilled into a pre-computed lookup table, enabling the edge device to perform real-time fault-aware inference without running heavy AI models locally. We extensively validated this framework on a ResNet18 lane-following model across 460 fault scenarios. Results show that while the model achieves a baseline R^2 of approximately 0.85 on clean data, our generated faults expose significant robustness degradation, with RMSE increasing by up to 99% and within-0.10 localization accuracy dropping to as low as 31.0% under fog conditions, demonstrating the inadequacy of normal-data evaluation for real-world edge AI deployment.
29.0DCMar 19
Hardware Utilization and Inference Performance of Edge Object Detection Under Fault InjectionFaezeh Pasandideh, Mehdi Azarafza, Achim Rettberg
As deep learning models are deployed on resource constrained edge platforms in autonomous driving systems, reli able knowledge of hardware behavior under resource degradation becomes an essential requirement. Therefore, we introduce a systematic characterization of CPU load, GPU utilization, RAM consumption, power draw, throughput, and thermal behaviour of TensorRT-optimized YOLOv10s, YOLOv11s and YOLO2026n pipelines running on NVIDIA Jetson Nano under a large-scale fault injection campaign targeting both lane-following and ob ject detection tasks. Faults are synthesized using a decoupled framework that leverages large language models (LLMs) and latent diffusion models (LDMs), based on original data from our JetBot platform data collection. Results show that across both tasks and both models the inference engines keep GPU occupancy stable, temperature rise under control, and power consumption within safe limits, while memory usage settles into a consistent release pattern after the initial warm-up phase. Object detection tends to show somewhat more variability in memory and thermal behavior, yet both tasks point to the same conclusion: the TensorRT pipelines hold up well even when the input data is heavily degraded. These findings offer a hardware-level view of model reliability that sits alongside, rather than against, the broader body of work focused on inference performance at the edge.
AIJun 5, 2025
Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic ReasoningMehdi Azarafza, Mojtaba Nayyeri, Faezeh Pasandideh et al.
Autonomous UAV operation necessitates reliable mathematical reasoning for tasks such as trajectory planning and power management. While traditional flight control relies on hardcoded equations, recent Large Language Models (LLMs) offer potential for more flexible problem-solving but struggle with reliably selecting and applying correct mathematical formulations and executing precise multi-step arithmetic. We propose RAG-UAV, a retrieval-augmented generation framework designed to improve the mathematical reasoning of several LLMs (including GPT o1/Turbo, Llama-3.2/3.3, Mistral, and DeepSeek R1) in UAV-specific contexts by providing access to relevant domain literature. To conduct an initial assessment, we introduce the UAV-Math-Bench, a 20-question problem set of UAV-centric mathematical problems across four difficulty levels. Our experiments demonstrate that incorporating retrieval substantially increases exact answer accuracy (achieving up to 75% with o1), reduces instances of incorrect formulation selection (from 25% without RAG to 5\% with RAG), and decreases numerical errors, reducing Mean Squared Error (MSE) by orders of magnitude for the best-performing models. This pilot study indicates that RAG can enable general-purpose LLMs to function as more reliable tools for engineering analysis, although direct real-time flight control requires further investigation and validation on a larger scale. All benchmark data, questions, and answers are publicly available.