Chenhao Xu

h-index13

7papers

138citations

Novelty41%

AI Score31

Ranked #129,822 of 194,257 authors (top 67%)#42,928 in CV (top 73%)

7 Papers

5.8LGAug 15, 2022Code

An Efficient and Reliable Asynchronous Federated Learning Scheme for Smart Public Transportation

Chenhao Xu, Youyang Qu, Tom H. Luan et al.

Since the traffic conditions change over time, machine learning models that predict traffic flows must be updated continuously and efficiently in smart public transportation. Federated learning (FL) is a distributed machine learning scheme that allows buses to receive model updates without waiting for model training on the cloud. However, FL is vulnerable to poisoning or DDoS attacks since buses travel in public. Some work introduces blockchain to improve reliability, but the additional latency from the consensus process reduces the efficiency of FL. Asynchronous Federated Learning (AFL) is a scheme that reduces the latency of aggregation to improve efficiency, but the learning performance is unstable due to unreasonably weighted local models. To address the above challenges, this paper offers a blockchain-based asynchronous federated learning scheme with a dynamic scaling factor (DBAFL). Specifically, the novel committee-based consensus algorithm for blockchain improves reliability at the lowest possible cost of time. Meanwhile, the devised dynamic scaling factor allows AFL to assign reasonable weights to stale local models. Extensive experiments conducted on heterogeneous devices validate outperformed learning performance, efficiency, and reliability of DBAFL.

5.0CVOct 19, 2023

Deep Learning Techniques for Video Instance Segmentation: A Survey

Chenhao Xu, Chang-Tsun Li, Yongjian Hu et al.

Video instance segmentation, also known as multi-object tracking and segmentation, is an emerging computer vision research area introduced in 2019, aiming at detecting, segmenting, and tracking instances in videos simultaneously. By tackling the video instance segmentation tasks through effective analysis and utilization of visual information in videos, a range of computer vision-enabled applications (e.g., human action recognition, medical image processing, autonomous vehicle navigation, surveillance, etc) can be implemented. As deep-learning techniques take a dominant role in various computer vision areas, a plethora of deep-learning-based video instance segmentation schemes have been proposed. This survey offers a multifaceted view of deep-learning schemes for video instance segmentation, covering various architectural paradigms, along with comparisons of functional performance, model complexity, and computational overheads. In addition to the common architectural designs, auxiliary techniques for improving the performance of deep-learning models for video instance segmentation are compiled and discussed. Finally, we discuss a range of major challenges and directions for further investigations to help advance this promising research field.

8.7CVApr 8, 2024Code

HSViT: Horizontally Scalable Vision Transformer

Chenhao Xu, Chang-Tsun Li, Chee Peng Lim et al.

Due to its deficiency in prior knowledge (inductive bias), Vision Transformer (ViT) requires pre-training on large-scale datasets to perform well. Moreover, the growing layers and parameters in ViT models impede their applicability to devices with limited computing resources. To mitigate the aforementioned challenges, this paper introduces a novel horizontally scalable vision transformer (HSViT) scheme. Specifically, a novel image-level feature embedding is introduced to ViT, where the preserved inductive bias allows the model to eliminate the need for pre-training while outperforming on small datasets. Besides, a novel horizontally scalable architecture is designed, facilitating collaborative model training and inference across multiple computing devices. The experimental results depict that, without pre-training, HSViT achieves up to 10% higher top-1 accuracy than state-of-the-art schemes on small datasets, while providing existing CNN backbones up to 3.1% improvement in top-1 accuracy on ImageNet. The code is available at https://github.com/xuchenhao001/HSViT.

1.2DBApr 27, 2025Code

BQSched: A Non-intrusive Scheduler for Batch Concurrent Queries via Reinforcement Learning

Chenhao Xu, Chunyu Chen, Jinglin Peng et al.

Most large enterprises build predefined data pipelines and execute them periodically to process operational data using SQL queries for various tasks. A key issue in minimizing the overall makespan of these pipelines is the efficient scheduling of concurrent queries within the pipelines. Existing tools mainly rely on simple heuristic rules due to the difficulty of expressing the complex features and mutual influences of queries. The latest reinforcement learning (RL) based methods have the potential to capture these patterns from feedback, but it is non-trivial to apply them directly due to the large scheduling space, high sampling cost, and poor sample utilization. Motivated by these challenges, we propose BQSched, a non-intrusive Scheduler for Batch concurrent Queries via reinforcement learning. Specifically, BQSched designs an attention-based state representation to capture the complex query patterns, and proposes IQ-PPO, an auxiliary task-enhanced proximal policy optimization (PPO) algorithm, to fully exploit the rich signals of Individual Query completion in logs. Based on the RL framework above, BQSched further introduces three optimization strategies, including adaptive masking to prune the action space, scheduling gain-based query clustering to deal with large query sets, and an incremental simulator to reduce sampling cost. To our knowledge, BQSched is the first non-intrusive batch query scheduler via RL. Extensive experiments show that BQSched can significantly improve the efficiency and stability of batch query scheduling, while also achieving remarkable scalability and adaptability in both data and queries. For example, across all DBMSs and scales tested, BQSched reduces the overall makespan of batch queries on TPC-DS benchmark by an average of 34% and 13%, compared with the commonly used heuristic strategy and the adapted RL-based scheduler, respectively.

2.0CVMay 10, 2024

ODC-SA Net: Orthogonal Direction Enhancement and Scale Aware Network for Polyp Segmentation

Chenhao Xu, Yudian Zhang, Kaiye Xu et al.

Accurate polyp segmentation is crucial for the early detection and prevention of colorectal cancer. However, the existing polyp detection methods sometimes ignore multi-directional features and drastic changes in scale. To address these challenges, we design an Orthogonal Direction Enhancement and Scale Aware Network (ODC-SA Net) for polyp segmentation. The Orthogonal Direction Convolutional (ODC) block can extract multi-directional features using transposed rectangular convolution kernels through forming an orthogonal feature vector basis, which solves the issue of random feature direction changes and reduces computational load. Additionally, the Multi-scale Fusion Attention (MSFA) mechanism is proposed to emphasize scale changes in both spatial and channel dimensions, enhancing the segmentation accuracy for polyps of varying sizes. Extraction with Re-attention Module (ERA) is used to re-combinane effective features, and Structures of Shallow Reverse Attention Mechanism (SRA) is used to enhance polyp edge with low level information. A large number of experiments conducted on public datasets have demonstrated that the performance of this model is superior to state-of-the-art methods.

3.1LGMar 12, 2021

SCEI: A Smart-Contract Driven Edge Intelligence Framework for IoT Systems

Chenhao Xu, Jiaqi Ge, Yong Li et al.

Federated learning (FL) enables collaborative training of a shared model on edge devices while maintaining data privacy. FL is effective when dealing with independent and identically distributed (iid) datasets, but struggles with non-iid datasets. Various personalized approaches have been proposed, but such approaches fail to handle underlying shifts in data distribution, such as data distribution skew commonly observed in real-world scenarios (e.g., driver behavior in smart transportation systems changing across time and location). Additionally, trust concerns among unacquainted devices and security concerns with the centralized aggregator pose additional challenges. To address these challenges, this paper presents a dynamically optimized personal deep learning scheme based on blockchain and federated learning. Specifically, the innovative smart contract implemented in the blockchain allows distributed edge devices to reach a consensus on the optimal weights of personalized models. Experimental evaluations using multiple models and real-world datasets demonstrate that the proposed scheme achieves higher accuracy and faster convergence compared to traditional federated and personalized learning approaches.

4.6CVSep 27, 2016

Learning convolutional neural network to maximize Pos@Top performance measure

Yanyan Geng, Ru-Ze Liang, Weizhi Li et al.

In the machine learning problems, the performance measure is used to evaluate the machine learning models. Recently, the number positive data points ranked at the top positions (Pos@Top) has been a popular performance measure in the machine learning community. In this paper, we propose to learn a convolutional neural network (CNN) model to maximize the Pos@Top performance measure. The CNN model is used to represent the multi-instance data point, and a classifier function is used to predict the label from the its CNN representation. We propose to minimize the loss function of Pos@Top over a training set to learn the filters of CNN and the classifier parameter. The classifier parameter vector is solved by the Lagrange multiplier method, and the filters are updated by the gradient descent method alternately in an iterative algorithm. Experiments over benchmark data sets show that the proposed method outperforms the state-of-the-art Pos@Top maximization methods.