Yao Deng

CV
h-index: 10
14 papers
503 citations
Novelty: 39%
AI Score: 45

14 Papers

CV | May 1, 2025 | Code
SOTA: Spike-Navigated Optimal TrAnsport Saliency Region Detection in Composite-bias Videos

Wenxuan Liu, Yao Deng, Kang Chen et al.

Existing saliency detection methods struggle in real-world scenarios due to motion blur and occlusions. In contrast, spike cameras, with their high temporal resolution, significantly enhance visual saliency maps. However, the composite noise inherent to spike camera imaging introduces discontinuities in saliency detection. Low-quality samples further distort model predictions, leading to saliency bias. To address these challenges, we propose Spike-navigated Optimal TrAnsport Saliency Region Detection (SOTA), a framework that leverages the strengths of spike cameras while mitigating biases in both spatial and temporal dimensions. Our method introduces Spike-based Micro-debias (SM) to capture subtle frame-to-frame variations and preserve critical details, even under minimal scene or lighting changes. Additionally, Spike-based Global-debias (SG) refines predictions by reducing inconsistencies across diverse conditions. Extensive experiments on real and synthetic datasets demonstrate that SOTA outperforms existing methods by eliminating composite noise bias. Our code and dataset will be released at https://github.com/lwxfight/sota.

RO | Jan 16
Visual Marker Search for Autonomous Drone Landing in Diverse Urban Environments

Jiaohong Yao, Linfeng Liang, Yao Deng et al.

Marker-based landing is widely used in drone delivery and return-to-base systems for its simplicity and reliability. However, most approaches assume idealized landing site visibility and sensor performance, limiting robustness in complex urban settings. We present a simulation-based evaluation suite on the AirSim platform with systematically varied urban layouts, lighting, and weather to replicate realistic operational diversity. Using onboard camera sensors (RGB for marker detection and depth for obstacle avoidance), we benchmark two heuristic coverage patterns and a reinforcement learning-based agent, analyzing how exploration strategy and scene complexity affect success rate, path efficiency, and robustness. Results underscore the need to evaluate marker-based autonomous landing under diverse, sensor-relevant conditions to guide the development of reliable aerial navigation systems.

CV | Feb 28, 2025
OpenEarthSensing: Large-Scale Fine-Grained Benchmark for Open-World Remote Sensing

Xiang Xiang, Zhuo Xu, Yao Deng et al.

The advancement of remote sensing, including satellite systems, facilitates the continuous acquisition of remote sensing imagery globally, introducing novel challenges for achieving open-world tasks. Deployed models need to continuously adjust to a constant influx of new data, which frequently exhibits diverse shifts from the data encountered during the training phase. To effectively handle the new data, models are required to detect semantic shifts, adapt to covariate shifts, and continuously update their parameters without forgetting learned knowledge, as has been considered in works on a variety of open-world tasks. However, existing studies are typically conducted within a single dataset to simulate realistic conditions, with a lack of large-scale benchmarks capable of evaluating multiple open-world tasks. In this paper, we introduce OpenEarthSensing (OES), a large-scale fine-grained benchmark for open-world remote sensing. OES includes 189 scene and object categories, covering the vast majority of potential semantic shifts that may occur in the real world. Additionally, to provide a more comprehensive testbed for evaluating the generalization performance, OES encompasses five data domains with significant covariate shifts, including two RGB satellite domains, one RGB aerial domain, one multispectral RGB domain, and one infrared domain. We evaluate the baselines and existing methods for diverse tasks on OES, demonstrating that it serves as a meaningful and challenging benchmark for open-world remote sensing. The proposed dataset OES is available at https://haiv-lab.github.io/OES.

CV | Apr 2, 2024
AI WALKUP: A Computer-Vision Approach to Quantifying MDS-UPDRS in Parkinson's Disease

Xiang Xiang, Zihan Zhang, Jing Ma et al.

Parkinson's Disease (PD) is the second most common neurodegenerative disorder. PD is usually assessed with the Movement Disorder Society - Unified Parkinson's Disease Rating Scale (MDS-UPDRS), which rates the severity of various types of motor symptoms and disease progression. However, manual assessment suffers from high subjectivity, lack of consistency, high cost, and the low efficiency of manual communication. We propose a computer-vision-based solution that captures human pose images with a camera, reconstructs and analyzes motion algorithmically, and extracts features of the amount of motion through feature engineering. The proposed approach can be deployed on different smartphones, and video recording and artificial intelligence analysis can be done quickly and easily through our app.

RO | Oct 25, 2025
Bridging Perception and Reasoning: Dual-Pipeline Neuro-Symbolic Landing for UAVs in Cluttered Environments

Weixian Qian, Sebastian Schroder, Yao Deng et al.

Autonomous landing in unstructured (cluttered, uneven, and map-poor) environments is a core requirement for Unmanned Aerial Vehicles (UAVs), yet purely vision-based or deep learning models often falter under covariate shift and provide limited interpretability. We propose NeuroSymLand, a neuro-symbolic framework that tightly couples two complementary pipelines: (i) an offline pipeline, where Large Language Models (LLMs) and human-in-the-loop refinement synthesize Scallop code from diverse landing scenarios, distilling generalizable and verifiable symbolic knowledge; and (ii) an online pipeline, where a compact foundation-based semantic segmentation model generates probabilistic Scallop facts that are composed into semantic scene graphs for real-time deductive reasoning. This design combines the perceptual strengths of lightweight foundation models with the interpretability and verifiability of symbolic reasoning. Node attributes (e.g., flatness, area) and edge relations (adjacency, containment, proximity) are computed with geometric routines rather than learned, avoiding the data dependence and latency of train-time graph builders. The resulting Scallop program encodes landing principles (avoid water and obstacles; prefer large, flat, accessible regions) and yields calibrated safety scores with ranked Regions of Interest (ROIs) and human-readable justifications. Extensive evaluations across datasets, diverse simulation maps, and real UAV hardware show that NeuroSymLand achieves higher accuracy, stronger robustness to covariate shift, and superior efficiency compared with state-of-the-art baselines, while advancing UAV safety and reliability in emergency response, surveillance, and delivery missions.

CV | Oct 22, 2025
HAD: Hierarchical Asymmetric Distillation to Bridge Spatio-Temporal Gaps in Event-Based Object Tracking

Yao Deng, Xian Zhong, Wenxuan Liu et al.

RGB cameras excel at capturing rich texture details with high spatial resolution, whereas event cameras offer exceptional temporal resolution and a high dynamic range (HDR). Leveraging their complementary strengths can substantially enhance object tracking under challenging conditions, such as high-speed motion, HDR environments, and dynamic background interference. However, a significant spatio-temporal asymmetry exists between these two modalities due to their fundamentally different imaging mechanisms, hindering effective multi-modal integration. To address this issue, we propose Hierarchical Asymmetric Distillation (HAD), a multi-modal knowledge distillation framework that explicitly models and mitigates spatio-temporal asymmetries. Specifically, HAD proposes a hierarchical alignment strategy that minimizes information loss while maintaining the student network's computational efficiency and parameter compactness. Extensive experiments demonstrate that HAD consistently outperforms state-of-the-art methods, and comprehensive ablation studies further validate the effectiveness and necessity of each designed component. The code will be released soon.

CV | May 25, 2023
CUEING: a lightweight model to Capture hUman attEntion In driviNG

Linfeng Liang, Yao Deng, Yang Zhang et al.

Discrepancies in decision-making between Autonomous Driving Systems (ADS) and human drivers underscore the need for intuitive human gaze predictors to bridge this gap, thereby improving user trust and experience. Existing gaze datasets, despite their value, suffer from noise that hampers effective training. Furthermore, current gaze prediction models exhibit inconsistency across diverse scenarios and demand substantial computational resources, restricting their on-board deployment in autonomous vehicles. We propose a novel adaptive cleansing technique for purging noise from existing gaze datasets, coupled with a robust, lightweight convolutional self-attention gaze prediction model. Our approach not only significantly enhances model generalizability and performance by up to 12.13% but also ensures a remarkable reduction in model complexity by up to 98.2% compared to the state-of-the-art, making in-vehicle deployment feasible to augment ADS decision visualization and performance.

SE | Jun 23, 2021
Testing of Autonomous Driving Systems: Where Are We and Where Should We Go?

Guannan Lou, Yao Deng, Xi Zheng et al.

Autonomous driving has shown great potential to reform modern transportation. Yet its reliability and safety have drawn a lot of attention and concerns. Compared with traditional software systems, autonomous driving systems (ADSs) often use deep neural networks in tandem with logic-based modules. This new paradigm poses unique challenges for software testing. Despite the recent development of new ADS testing techniques, it is not clear to what extent those techniques have addressed the needs of ADS practitioners. To fill this gap, we present the first comprehensive study to identify the current practices and needs of ADS testing. We conducted semi-structured interviews with developers from 10 autonomous driving companies and surveyed 100 developers who have worked on autonomous driving systems. A systematic analysis of the interview and survey data revealed 7 common practices and 4 emerging needs of autonomous driving testing. Through a comprehensive literature review, we developed a taxonomy of existing ADS testing techniques and analyzed the gap between ADS research and practitioners' needs. Finally, we proposed several future directions for SE researchers, such as developing test reduction techniques to accelerate simulation-based ADS testing.

LG | Apr 5, 2021
Deep Learning-Based Autonomous Driving Systems: A Survey of Attacks and Defenses

Yao Deng, Tiehua Zhang, Guannan Lou et al.

The rapid development of artificial intelligence, especially deep learning technology, has advanced autonomous driving systems (ADSs) by providing precise control decisions to handle almost any driving event, spanning from anti-fatigue safe driving to intelligent route planning. However, ADSs are still plagued by increasing threats from different attacks, which can be categorized into physical attacks, cyberattacks, and learning-based adversarial attacks. Inevitably, the safety and security of deep learning-based autonomous driving are severely challenged by these attacks, so countermeasures should be analyzed and studied comprehensively to mitigate all potential risks. This survey provides a thorough analysis of different attacks that may jeopardize ADSs, as well as the corresponding state-of-the-art defense mechanisms. The analysis proceeds through an in-depth overview of each step in the ADS workflow, covering adversarial attacks for various deep learning models and attacks in both physical and cyber contexts. Furthermore, some promising research directions are suggested to improve deep learning-based autonomous driving safety, including model robustness training, model testing and verification, and anomaly detection based on cloud/edge servers.

LG | Mar 12, 2021
SCEI: A Smart-Contract Driven Edge Intelligence Framework for IoT Systems

Chenhao Xu, Jiaqi Ge, Yong Li et al.

Federated learning (FL) enables collaborative training of a shared model on edge devices while maintaining data privacy. FL is effective when dealing with independent and identically distributed (iid) datasets, but struggles with non-iid datasets. Various personalized approaches have been proposed, but such approaches fail to handle underlying shifts in data distribution, such as data distribution skew commonly observed in real-world scenarios (e.g., driver behavior in smart transportation systems changing across time and location). Additionally, trust concerns among unacquainted devices and security concerns with the centralized aggregator pose additional challenges. To address these challenges, this paper presents a dynamically optimized personal deep learning scheme based on blockchain and federated learning. Specifically, the innovative smart contract implemented in the blockchain allows distributed edge devices to reach a consensus on the optimal weights of personalized models. Experimental evaluations using multiple models and real-world datasets demonstrate that the proposed scheme achieves higher accuracy and faster convergence compared to traditional federated and personalized learning approaches.

SE | Dec 19, 2020
A Declarative Metamorphic Testing Framework for Autonomous Driving

Yao Deng, Xi Zheng, Tianyi Zhang et al.

Autonomous driving has gained much attention from both industry and academia. Currently, Deep Neural Networks (DNNs) are widely used for perception and control in autonomous driving. However, several fatal accidents caused by autonomous vehicles have raised serious safety concerns about autonomous driving models. Some recent studies have successfully used the metamorphic testing technique to detect thousands of potential issues in some widely used autonomous driving models. However, prior studies are limited to a small set of metamorphic relations, which do not reflect rich, real-world traffic scenarios and are also not customizable. This paper presents a novel declarative rule-based metamorphic testing framework called RMT. RMT provides a rule template with natural language syntax, allowing users to flexibly specify an enriched set of testing scenarios based on real-world traffic rules and domain knowledge. RMT automatically parses human-written rules into metamorphic relations using an NLP-based rule parser that refers to an ontology list, and generates test cases with a variety of image transformation engines. We evaluated RMT on three autonomous driving models. With an enriched set of metamorphic relations, RMT detected a significant number of abnormal model predictions that were not detected by prior work. Through a large-scale human study on Amazon Mechanical Turk, we further confirmed the authenticity of test cases generated by RMT and the validity of detected abnormal model predictions.

SP | Feb 6, 2020
An Analysis of Adversarial Attacks and Defenses on Autonomous Driving Models

Yao Deng, Xi Zheng, Tianyi Zhang et al.

Nowadays, autonomous driving has attracted much attention from both industry and academia. The convolutional neural network (CNN) is a key component in autonomous driving, and it is also increasingly adopted in pervasive computing such as smartphones, wearable devices, and IoT networks. Prior work shows CNN-based classification models are vulnerable to adversarial attacks. However, it is uncertain to what extent regression models such as driving models are vulnerable to adversarial attacks, how effective existing defense techniques are, and what the defense implications are for system and middleware builders. This paper presents an in-depth analysis of five adversarial attacks and four defense methods on three driving models. Experiments show that, similar to classification models, these models are still highly vulnerable to adversarial attacks. This poses a big security threat to autonomous driving and thus should be taken into account in practice. While these defense methods can effectively defend against different attacks, none of them are able to provide adequate protection against all five attacks. We derive several implications for system and middleware builders: (1) when adding a defense component against adversarial attacks, it is important to deploy multiple defense methods in tandem to achieve good coverage of various attacks, (2) a black-box attack is much less effective than a white-box attack, implying that it is important to keep model details (e.g., model architecture, hyperparameters) confidential via model obfuscation, and (3) driving models with a complex architecture are preferred if computing resources permit, as they are more resilient to adversarial attacks than simple models.

CR | Jul 18, 2019
Towards a Multi-Chain Future of Proof-of-Space

Shuyang Tang, Jilai Zheng, Yao Deng et al.

Proof-of-Space provides an intriguing alternative for the consensus protocol of permissionless blockchains due to its recyclable nature and the potential to support multiple chains simultaneously. However, a direct shared proof of the same storage, which was adopted in the existing multi-chain schemes based on Proof-of-Space, could give rise to a newborn attack when a new chain launches. To close this gap, we propose an innovative framework of single-chain Proof-of-Space and further present a novel multi-chain scheme that can resist newborn attacks effectively by carefully combining shared proof and chain-specific proof of storage. Moreover, we analyze the security of the multi-chain scheme and prove that it is incentive-compatible. This means that participants in such a multi-chain system can achieve their greatest utility with our proposed strategy of storage resource partition.

CR | Jul 17, 2019
An Overview of Attacks and Defences on Intelligent Connected Vehicles

Mahdi Dibaei, Xi Zheng, Kun Jiang et al.

Cyber security is one of the most significant challenges in connected vehicular systems, and connected vehicles are prone to different cybersecurity attacks that endanger passengers' safety. Cyber security in intelligent connected vehicles is composed of in-vehicle security and security of inter-vehicle communications. Security of Electronic Control Units (ECUs) and the Controller Area Network (CAN) bus are the most significant parts of in-vehicle security. Besides, with the development of 4G LTE and 5G remote communication technologies for vehicle-to-everything (V2X) communications, the security of inter-vehicle communications is another potential problem. After giving a short introduction to the architecture of next-generation vehicles, including driverless and intelligent vehicles, this review paper identifies a few major security attacks on intelligent connected vehicles. Based on these attacks, we provide a comprehensive survey of available defences and classify them into four categories, i.e., cryptography, network security, software vulnerability detection, and malware detection. We also explore the future directions for preventing attacks on intelligent vehicle systems.