Chenguang Liu

CV
h-index32
17papers
190citations
Novelty49%
AI Score45

17 Papers

SPNov 17, 2023
Cooperative Perception with Learning-Based V2V communications

Chenguang Liu, Yunfei Chen, Jianjun Chen et al.

Cooperative perception has been widely used in autonomous driving to alleviate the inherent limitation of single automated vehicle perception. To enable cooperation, vehicle-to-vehicle (V2V) communication plays an indispensable role. This work analyzes the performance of cooperative perception accounting for communications channel impairments. Different fusion methods and channel impairments are evaluated. A new late fusion scheme is proposed to leverage the robustness of intermediate features. In order to compress the data size incurred by cooperation, a convolution neural network-based autoencoder is adopted. Numerical results demonstrate that intermediate fusion is more robust to channel impairments than early fusion and late fusion, when the SNR is greater than 0 dB. Also, the proposed fusion scheme outperforms the conventional late fusion using detection outputs, and autoencoder provides a good compromise between detection accuracy and bandwidth usage.

SPNov 23, 2023
Knowledge Distillation Based Semantic Communications For Multiple Users

Chenguang Liu, Yuxin Zhou, Yunfei Chen et al.

Deep learning (DL) has shown great potential in revolutionizing the traditional communications system. Many applications in communications have adopted DL techniques due to their powerful representation ability. However, the learning-based methods can be dependent on the training dataset and perform worse on unseen interference due to limited model generalizability and complexity. In this paper, we consider the semantic communication (SemCom) system with multiple users, where there is a limited number of training samples and unexpected interference. To improve the model generalization ability and reduce the model size, we propose a knowledge distillation (KD) based system where Transformer based encoder-decoder is implemented as the semantic encoder-decoder and fully connected neural networks are implemented as the channel encoder-decoder. Specifically, four types of knowledge transfer and model compression are analyzed. Important system and model parameters are considered, including the level of noise and interference, the number of interfering users and the size of the encoder and decoder. Numerical results demonstrate that KD significantly improves the robustness and the generalization ability when applied to unexpected interference, and it reduces the performance loss when compressing the model size.

CVApr 9, 2024Code
YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images

Chenguang Liu, Guangshuai Gao, Ziyue Huang et al.

Detecting objects from aerial images poses significant challenges due to the following factors: 1) Aerial images typically have very large sizes, generally with millions or even hundreds of millions of pixels, while computational resources are limited. 2) Small object size leads to insufficient information for effective detection. 3) Non-uniform object distribution leads to computational resource wastage. To address these issues, we propose YOLC (You Only Look Clusters), an efficient and effective framework that builds on an anchor-free object detector, CenterNet. To overcome the challenges posed by large-scale images and non-uniform object distribution, we introduce a Local Scale Module (LSM) that adaptively searches cluster regions for zooming in for accurate detection. Additionally, we modify the regression loss using Gaussian Wasserstein distance (GWD) to obtain high-quality bounding boxes. Deformable convolution and refinement methods are employed in the detection head to enhance the detection of small objects. We perform extensive experiments on two aerial image datasets, including Visdrone2019 and UAVDT, to demonstrate the effectiveness and superiority of our proposed approach. Code is available at https://github.com/dawn-ech/YOLC.

OCSep 8, 2022
Losing momentum in continuous-time stochastic optimisation

Kexin Jin, Jonas Latz, Chenguang Liu et al.

The training of modern machine learning models often consists in solving high-dimensional non-convex optimisation problems that are subject to large-scale data. In this context, momentum-based stochastic optimisation algorithms have become particularly widespread. The stochasticity arises from data subsampling which reduces computational cost. Both, momentum and stochasticity help the algorithm to converge globally. In this work, we propose and analyse a continuous-time model for stochastic gradient descent with momentum. This model is a piecewise-deterministic Markov process that represents the optimiser by an underdamped dynamical system and the data subsampling through a stochastic switching. We investigate longtime limits, the subsampling-to-no-subsampling limit, and the momentum-to-no-momentum limit. We are particularly interested in the case of reducing the momentum over time. Under convexity assumptions, we show convergence of our dynamical system to the global minimiser when reducing momentum over time and letting the subsampling rate go to infinity. We then propose a stable, symplectic discretisation scheme to construct an algorithm from our continuous-time dynamical system. In experiments, we study our scheme in convex and non-convex test problems. Additionally, we train a convolutional neural network in an image classification problem. Our algorithm {attains} competitive results compared to stochastic gradient descent with momentum.

NADec 31, 2025
Convergence of the generalization error for deep gradient flow methods for PDEs

Chenguang Liu, Antonis Papapantoleon, Jasper Rou

The aim of this article is to provide a firm mathematical foundation for the application of deep gradient flow methods (DGFMs) for the solution of (high-dimensional) partial differential equations (PDEs). We decompose the generalization error of DGFMs into an approximation and a training error. We first show that the solution of PDEs that satisfy reasonable and verifiable assumptions can be approximated by neural networks, thus the approximation error tends to zero as the number of neurons tends to infinity. Then, we derive the gradient flow that the training process follows in the ``wide network limit'' and analyze the limit of this flow as the training time tends to infinity. These results combined show that the generalization error of DGFMs tends to zero as the number of neurons and the training time tend to infinity.

LGJul 11, 2024
How to beat a Bayesian adversary

Zihan Ding, Kexin Jin, Jonas Latz et al.

Deep neural networks and other modern machine learning models are often susceptible to adversarial attacks. Indeed, an adversary may often be able to change a model's prediction through a small, directed perturbation of the model's input - an issue in safety-critical applications. Adversarially robust machine learning is usually based on a minmax optimisation problem that minimises the machine learning loss under maximisation-based adversarial attacks. In this work, we study adversaries that determine their attack using a Bayesian statistical approach rather than maximisation. The resulting Bayesian adversarial robustness problem is a relaxation of the usual minmax problem. To solve this problem, we propose Abram - a continuous-time particle system that shall approximate the gradient flow corresponding to the underlying learning problem. We show that Abram approximates a McKean-Vlasov process and justify the use of Abram by giving assumptions under which the McKean-Vlasov process finds the minimiser of the Bayesian adversarial robustness problem. We discuss two ways to discretise Abram and show its suitability in benchmark adversarial deep learning experiments.

CVApr 13, 2025Code
Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation

Yongchao Feng, Yajie Liu, Shuai Yang et al.

Vision-Language Model (VLM) have gained widespread adoption in Open-Vocabulary (OV) object detection and segmentation tasks. Despite they have shown promise on OV-related tasks, their effectiveness in conventional vision tasks has thus far been unevaluated. In this work, we present the systematic review of VLM-based detection and segmentation, view VLM as the foundational model and conduct comprehensive evaluations across multiple downstream tasks for the first time: 1) The evaluation spans eight detection scenarios (closed-set detection, domain adaptation, crowded objects, etc.) and eight segmentation scenarios (few-shot, open-world, small object, etc.), revealing distinct performance advantages and limitations of various VLM architectures across tasks. 2) As for detection tasks, we evaluate VLMs under three finetuning granularities: \textit{zero prediction}, \textit{visual fine-tuning}, and \textit{text prompt}, and further analyze how different finetuning strategies impact performance under varied task. 3) Based on empirical findings, we provide in-depth analysis of the correlations between task characteristics, model architectures, and training methodologies, offering insights for future VLM design. 4) We believe that this work shall be valuable to the pattern recognition experts working in the fields of computer vision, multimodal learning, and vision foundation models by introducing them to the problem, and familiarizing them with the current status of the progress while providing promising directions for future research. A project associated with this review and evaluation has been created at https://github.com/better-chao/perceptual_abilities_evaluation.

CVDec 16, 2023
Self-supervised Adaptive Weighting for Cooperative Perception in V2V Communications

Chenguang Liu, Jianjun Chen, Yunfei Chen et al.

Perception of the driving environment is critical for collision avoidance and route planning to ensure driving safety. Cooperative perception has been widely studied as an effective approach to addressing the shortcomings of single-vehicle perception. However, the practical limitations of vehicle-to-vehicle (V2V) communications have not been adequately investigated. In particular, current cooperative fusion models rely on supervised models and do not address dynamic performance degradation caused by arbitrary channel impairments. In this paper, a self-supervised adaptive weighting model is proposed for intermediate fusion to mitigate the adverse effects of channel distortion. The performance of cooperative perception is investigated in different system settings. Rician fading and imperfect channel state information (CSI) are also considered. Numerical results demonstrate that the proposed adaptive weighting algorithm significantly outperforms the benchmarks without weighting. Visualization examples validate that the proposed weighting algorithm can flexibly adapt to various channel conditions. Moreover, the adaptive weighting algorithm demonstrates good generalization to untrained channels and test datasets from different domains.

CVApr 7, 2024
Msmsfnet: a multi-stream and multi-scale fusion net for edge detection

Chenguang Liu, Chisheng Wang, Feifei Dong et al.

Edge detection is a long-standing problem in computer vision. Despite the efficiency of existing algorithms, their performance, however, rely heavily on the pre-trained weights of the backbone network on the ImageNet dataset. The use of pre-trained weights in previous methods significantly increases the difficulty to design new models for edge detection without relying on existing well-trained ImageNet models, as pre-training the model on the ImageNet dataset is expensive and becomes compulsory to ensure the fairness of comparison. Besides, the pre-training and fine-tuning strategy is not always useful and sometimes even inaccessible. For instance, the pre-trained weights on the ImageNet dataset are unlikely to be helpful for edge detection in Synthetic Aperture Radar (SAR) images due to strong differences in the statistics between optical images and SAR images. Moreover, no dataset has comparable size to the ImageNet dataset for SAR image processing. In this work, we study the performance achievable by state-of-the-art deep learning based edge detectors in publicly available datasets when they are trained from scratch, and devise a new network architecture, the multi-stream and multi-scale fusion net (msmsfnet), for edge detection. We show in our experiments that by training all models from scratch, our model outperforms state-of-the-art edge detectors in three publicly available datasets. We also demonstrate the efficiency of our model for edge detection in SAR images, where no useful pre-trained weight is available. Finally, We show that our model is able to achieve competitive performance on the BSDS500 dataset when the pre-trained weights are used.

CVAug 27, 2025
POEv2: a flexible and robust framework for generic line segment detection and wireframe line segment detection

Chenguang Liu, Chisheng Wang, Yuhua Cai et al.

Line segment detection in images has been studied for several decades. Existing line segment detectors can be roughly divided into two categories: generic line segment detectors and wireframe line segment detectors. Generic line segment detectors aim to detect all meaningful line segments in images and traditional approaches usually fall into this category. Recent deep learning based approaches are mostly wireframe line segment detectors. They detect only line segments that are geometrically meaningful and have large spatial support. Due to the difference in the aim of design, the performance of generic line segment detectors for the task of wireframe line segment detection won't be satisfactory, and vice versa. In this work, we propose a robust framework that can be used for both generic line segment detection and wireframe line segment detection. The proposed method is an improved version of the Pixel Orientation Estimation (POE) method. It is thus named as POEv2. POEv2 detects line segments from edge strength maps, and can be combined with any edge detector. We show in our experiments that by combining the proposed POEv2 with an efficient edge detector, it achieves state-of-the-art performance on three publicly available datasets.

CVJul 1, 2025
De-Simplifying Pseudo Labels to Enhancing Domain Adaptive Object Detection

Zehua Fu, Chenguang Liu, Yuyu Chen et al.

Despite its significant success, object detection in traffic and transportation scenarios requires time-consuming and laborious efforts in acquiring high-quality labeled data. Therefore, Unsupervised Domain Adaptation (UDA) for object detection has recently gained increasing research attention. UDA for object detection has been dominated by domain alignment methods, which achieve top performance. Recently, self-labeling methods have gained popularity due to their simplicity and efficiency. In this paper, we investigate the limitations that prevent self-labeling detectors from achieving commensurate performance with domain alignment methods. Specifically, we identify the high proportion of simple samples during training, i.e., the simple-label bias, as the central cause. We propose a novel approach called De-Simplifying Pseudo Labels (DeSimPL) to mitigate the issue. DeSimPL utilizes an instance-level memory bank to implement an innovative pseudo label updating strategy. Then, adversarial samples are introduced during training to enhance the proportion. Furthermore, we propose an adaptive weighted loss to avoid the model suffering from an abundance of false positive pseudo labels in the late training period. Experimental results demonstrate that DeSimPL effectively reduces the proportion of simple samples during training, leading to a significant performance improvement for self-labeling detectors. Extensive experiments conducted on four benchmarks validate our analysis and conclusions.

CVMay 6, 2025
Coop-WD: Cooperative Perception with Weighting and Denoising for Robust V2V Communication

Chenguang Liu, Jianjun Chen, Yunfei Chen et al.

Cooperative perception, leveraging shared information from multiple vehicles via vehicle-to-vehicle (V2V) communication, plays a vital role in autonomous driving to alleviate the limitation of single-vehicle perception. Existing works have explored the effects of V2V communication impairments on perception precision, but they lack generalization to different levels of impairments. In this work, we propose a joint weighting and denoising framework, Coop-WD, to enhance cooperative perception subject to V2V channel impairments. In this framework, the self-supervised contrastive model and the conditional diffusion probabilistic model are adopted hierarchically for vehicle-level and pixel-level feature enhancement. An efficient variant model, Coop-WD-eco, is proposed to selectively deactivate denoising to reduce processing overhead. Rician fading, non-stationarity, and time-varying distortion are considered. Simulation results demonstrate that the proposed Coop-WD outperforms conventional benchmarks in all types of channels. Qualitative analysis with visual examples further proves the superiority of our proposed method. The proposed Coop-WD-eco achieves up to 50% reduction in computational cost under severe distortion while maintaining comparable accuracy as channel conditions improve.

CVJan 15, 2025
PACF: Prototype Augmented Compact Features for Improving Domain Adaptive Object Detection

Chenguang Liu, Yongchao Feng, Yanan Zhang et al.

In recent years, there has been significant advancement in object detection. However, applying off-the-shelf detectors to a new domain leads to significant performance drop, caused by the domain gap. These detectors exhibit higher-variance class-conditional distributions in the target domain than that in the source domain, along with mean shift. To address this problem, we propose the Prototype Augmented Compact Features (PACF) framework to regularize the distribution of intra-class features. Specifically, we provide an in-depth theoretical analysis on the lower bound of the target features-related likelihood and derive the prototype cross entropy loss to further calibrate the distribution of target RoI features. Furthermore, a mutual regularization strategy is designed to enable the linear and prototype-based classifiers to learn from each other, promoting feature compactness while enhancing discriminability. Thanks to this PACF framework, we have obtained a more compact cross-domain feature space, within which the variance of the target features' class-conditional distributions has significantly decreased, and the class-mean shift between the two domains has also been further reduced. The results on different adaptation settings are state-of-the-art, which demonstrate the board applicability and effectiveness of the proposed approach.

MLMay 23, 2023
Subsampling Error in Stochastic Gradient Langevin Diffusions

Kexin Jin, Chenguang Liu, Jonas Latz

The Stochastic Gradient Langevin Dynamics (SGLD) are popularly used to approximate Bayesian posterior distributions in statistical learning procedures with large-scale data. As opposed to many usual Markov chain Monte Carlo (MCMC) algorithms, SGLD is not stationary with respect to the posterior distribution; two sources of error appear: The first error is introduced by an Euler--Maruyama discretisation of a Langevin diffusion process, the second error comes from the data subsampling that enables its use in large-scale data settings. In this work, we consider an idealised version of SGLD to analyse the method's pure subsampling error that we then see as a best-case error for diffusion-based subsampling MCMC methods. Indeed, we introduce and study the Stochastic Gradient Langevin Diffusion (SGLDiff), a continuous-time Markov process that follows the Langevin diffusion corresponding to a data subset and switches this data subset after exponential waiting times. There, we show the exponential ergodicity of SLGDiff and that the Wasserstein distance between the posterior and the limiting distribution of SGLDiff is bounded above by a fractional power of the mean waiting time. We bring our results into context with other analyses of SGLD.

LGDec 7, 2021
A Continuous-time Stochastic Gradient Descent Method for Continuous Data

Kexin Jin, Jonas Latz, Chenguang Liu et al.

Optimization problems with continuous data appear in, e.g., robust machine learning, functional data analysis, and variational inference. Here, the target function is given as an integral over a family of (continuously) indexed target functions - integrated with respect to a probability measure. Such problems can often be solved by stochastic optimization methods: performing optimization steps with respect to the indexed target function with randomly switched indices. In this work, we study a continuous-time variant of the stochastic gradient descent algorithm for optimization problems with continuous data. This so-called stochastic gradient process consists in a gradient flow minimizing an indexed target function that is coupled with a continuous-time index process determining the index. Index processes are, e.g., reflected diffusions, pure jump processes, or other Lévy processes on compact spaces. Thus, we study multiple sampling patterns for the continuous data space and allow for data simulated or streamed at runtime of the algorithm. We analyze the approximation properties of the stochastic gradient process and study its longtime behavior and ergodicity under constant and decreasing learning rates. We end with illustrating the applicability of the stochastic gradient process in a polynomial regression problem with noisy functional data, as well as in a physics-informed neural network.

CROct 15, 2020
Progressive Defense Against Adversarial Attacks for Deep Learning as a Service in Internet of Things

Ling Wang, Cheng Zhang, Zejian Luo et al.

Nowadays, Deep Learning as a service can be deployed in Internet of Things (IoT) to provide smart services and sensor data processing. However, recent research has revealed that some Deep Neural Networks (DNN) can be easily misled by adding relatively small but adversarial perturbations to the input (e.g., pixel mutation in input images). One challenge in defending DNN against these attacks is to efficiently identifying and filtering out the adversarial pixels. The state-of-the-art defense strategies with good robustness often require additional model training for specific attacks. To reduce the computational cost without loss of generality, we present a defense strategy called a progressive defense against adversarial attacks (PDAAA) for efficiently and effectively filtering out the adversarial pixel mutations, which could mislead the neural network towards erroneous outputs, without a-priori knowledge about the attack type. We evaluated our progressive defense strategy against various attack methods on two well-known datasets. The result shows it outperforms the state-of-the-art while reducing the cost of model training by 50% on average.

CYNov 1, 2019
rIoT: Enabling Seamless Context-Aware Automation in the Internet of Things

Jie Hua, Chenguang Liu, Tomasz Kalbarczyk et al.

Advances in mobile computing capabilities and an increasing number of Internet of Things (IoT) devices have enriched the possibilities of the IoT but have also increased the cognitive load required of IoT users. Existing context-aware systems provide various levels of automation in the IoT. Many of these systems adaptively take decisions on how to provide services based on assumptions made a priori. The approaches are difficult to personalize to an individual's dynamic environment, and thus today's smart IoT spaces often demand complex and specialized interactions with the user in order to provide tailored services. We propose rIoT, a framework for seamless and personalized automation of human-device interaction in the IoT. rIoT leverages existing technologies to operate across heterogeneous devices and networks to provide a one-stop solution for device interaction in the IoT. We show how rIoT exploits similarities between contexts and employs a decision-tree like method to adaptively capture a user's preferences from a small number of interactions with the IoT space. We measure the performance of rIoT on two real-world data sets and a real mobile device in terms of accuracy, learning speed, and latency in comparison to two state-of-the-art machine learning algorithms.