Chengjun Liu

CV
h-index17
14papers
309citations
Novelty43%
AI Score50

14 Papers

CVAug 12, 2022Code
Real-Time Accident Detection in Traffic Surveillance Using Deep Learning

Hadi Ghahremannezhad, Hang Shi, Chengjun Liu

Automatic detection of traffic accidents is an important emerging topic in traffic monitoring systems. Nowadays many urban intersections are equipped with surveillance cameras connected to traffic management systems. Therefore, computer vision techniques can be viable tools for automatic accident detection. This paper presents a new efficient framework for accident detection at intersections for traffic surveillance applications. The proposed framework consists of three hierarchical steps, including efficient and accurate object detection based on the state-of-the-art YOLOv4 method, object tracking based on Kalman filter coupled with the Hungarian algorithm for association, and accident detection by trajectory conflict analysis. A new cost function is applied for object association to accommodate for occlusion, overlapping objects, and shape changes in the object tracking step. The object trajectories are analyzed in terms of velocity, angle, and distance in order to detect different types of trajectory conflicts including vehicle-to-vehicle, vehicle-to-pedestrian, and vehicle-to-bicycle. Experimental results using real traffic video data show the feasibility of the proposed method in real-time applications of traffic surveillance. In particular, trajectory conflicts, including near-accidents and accidents occurring at urban intersections are detected with a low false alarm rate and a high detection rate. The robustness of the proposed framework is evaluated using video sequences collected from YouTube with diverse illumination conditions. The dataset is publicly available at: http://github.com/hadi-ghnd/AccidentDetection.

CVAug 26, 2022Code
Ammunition Component Classification Using Deep Learning

Hadi Ghahremannezhad, Chengjun Liu, Hang Shi

Ammunition scrap inspection is an essential step in the process of recycling ammunition metal scrap. Most ammunition is composed of a number of components, including case, primer, powder, and projectile. Ammo scrap containing energetics is considered to be potentially dangerous and should be separated before the recycling process. Manually inspecting each piece of scrap is tedious and time-consuming. We have gathered a dataset of ammunition components with the goal of applying artificial intelligence for classifying safe and unsafe scrap pieces automatically. First, two training datasets are manually created from visual and x-ray images of ammo. Second, the x-ray dataset is augmented using the spatial transforms of histogram equalization, averaging, sharpening, power law, and Gaussian blurring in order to compensate for the lack of sufficient training data. Lastly, the representative YOLOv4 object detection method is applied to detect the ammo components and classify the scrap pieces into safe and unsafe classes, respectively. The trained models are tested against unseen data in order to evaluate the performance of the applied method. The experiments demonstrate the feasibility of ammo component detection and classification using deep learning. The datasets and the pre-trained models are available at https://github.com/hadi-ghnd/Scrap-Classification.

CLMay 21, 2025
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

Tencent Hunyuan Team, Ao Liu, Botong Zhou et al. · tencent-ai

As Large Language Models (LLMs) rapidly advance, we introduce Hunyuan-TurboS, a novel large hybrid Transformer-Mamba Mixture of Experts (MoE) model. It synergistically combines Mamba's long-sequence processing efficiency with Transformer's superior contextual understanding. Hunyuan-TurboS features an adaptive long-short chain-of-thought (CoT) mechanism, dynamically switching between rapid responses for simple queries and deep "thinking" modes for complex problems, optimizing computational resources. Architecturally, this 56B activated (560B total) parameter model employs 128 layers (Mamba2, Attention, FFN) with an innovative AMF/MF block pattern. Faster Mamba2 ensures linear complexity, Grouped-Query Attention minimizes KV cache, and FFNs use an MoE structure. Pre-trained on 16T high-quality tokens, it supports a 256K context length and is the first industry-deployed large-scale Mamba model. Our comprehensive post-training strategy enhances capabilities via Supervised Fine-Tuning (3M instructions), a novel Adaptive Long-short CoT Fusion method, Multi-round Deliberation Learning for iterative improvement, and a two-stage Large-scale Reinforcement Learning process targeting STEM and general instruction-following. Evaluations show strong performance: overall top 7 rank on LMSYS Chatbot Arena with a score of 1356, outperforming leading models like Gemini-2.0-Flash-001 (1352) and o4-mini-2025-04-16 (1345). TurboS also achieves an average of 77.9% across 23 automated benchmarks. Hunyuan-TurboS balances high performance and efficiency, offering substantial capabilities at lower inference costs than many reasoning models, establishing a new paradigm for efficient large-scale pre-trained models.

CVNov 11, 2024Code
Increasing Rosacea Awareness Among Population Using Deep Learning and Statistical Approaches

Chengyu Yang, Chengjun Liu

Approximately 16 million Americans suffer from rosacea according to the National Rosacea Society. To increase rosacea awareness, automatic rosacea detection methods using deep learning and explainable statistical approaches are presented in this paper. The deep learning method applies the ResNet-18 for rosacea detection, and the statistical approaches utilize the means of the two classes, namely, the rosacea class vs. the normal class, and the principal component analysis to extract features from the facial images for automatic rosacea detection. The contributions of the proposed methods are three-fold. First, the proposed methods are able to automatically distinguish patients who are suffering from rosacea from people who are clean of this disease. Second, the statistical approaches address the explainability issue that allows doctors and patients to understand and trust the results. And finally, the proposed methods will not only help increase rosacea awareness in the general population but also help remind the patients who suffer from this disease of possible early treatment since rosacea is more treatable at its early stages. The code and data are available at https://github.com/chengyuyang-njit/rosacea_detection.git.

IVApr 10, 2025Code
Interpretable Automatic Rosacea Detection with Whitened Cosine Similarity

Chengyu Yang, Chengjun Liu

According to the National Rosacea Society, approximately sixteen million Americans suffer from rosacea, a common skin condition that causes flushing or long-term redness on a person's face. To increase rosacea awareness and to better assist physicians to make diagnosis on this disease, we propose an interpretable automatic rosacea detection method based on whitened cosine similarity in this paper. The contributions of the proposed methods are three-fold. First, the proposed method can automatically distinguish patients suffering from rosacea from people who are clean of this disease with a significantly higher accuracy than other methods in unseen test data, including both classical deep learning and statistical methods. Second, the proposed method addresses the interpretability issue by measuring the similarity between the test sample and the means of two classes, namely the rosacea class versus the normal class, which allows both medical professionals and patients to understand and trust the results. And finally, the proposed methods will not only help increase awareness of rosacea in the general population, but will also help remind patients who suffer from this disease of possible early treatment, as rosacea is more treatable in its early stages. The code and data are available at https://github.com/chengyuyang-njit/ICCRD-2025. The code and data are available at https://github.com/chengyuyang-njit/ICCRD-2025.

IVMay 27, 2025Code
Laparoscopic Image Desmoking Using the U-Net with New Loss Function and Integrated Differentiable Wiener Filter

Chengyu Yang, Chengjun Liu

Laparoscopic surgeries often suffer from reduced visual clarity due to the presence of surgical smoke originated by surgical instruments, which poses significant challenges for both surgeons and vision based computer-assisted technologies. In order to remove the surgical smoke, a novel U-Net deep learning with new loss function and integrated differentiable Wiener filter (ULW) method is presented. Specifically, the new loss function integrates the pixel, structural, and perceptual properties. Thus, the new loss function, which combines the structural similarity index measure loss, the perceptual loss, as well as the mean squared error loss, is able to enhance the quality and realism of the reconstructed images. Furthermore, the learnable Wiener filter is capable of effectively modelling the degradation process caused by the surgical smoke. The effectiveness of the proposed ULW method is evaluated using the publicly available paired laparoscopic smoke and smoke-free image dataset, which provides reliable benchmarking and quantitative comparisons. Experimental results show that the proposed ULW method excels in both visual clarity and metric-based evaluation. As a result, the proposed ULW method offers a promising solution for real-time enhancement of laparoscopic imagery. The code is available at https://github.com/chengyuyang-njit/ImageDesmoke.

LGNov 10, 2021Code
Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters

Xiangru Lian, Binhang Yuan, Xuefeng Zhu et al.

Deep learning based models have dominated the current landscape of production recommender systems. Furthermore, recent years have witnessed an exponential growth of the model scale--from Google's 2016 model with 1 billion parameters to the latest Facebook's model with 12 trillion parameters. Significant quality boost has come with each jump of the model capacity, which makes us believe the era of 100 trillion parameters is around the corner. However, the training of such models is challenging even within industrial scale data centers. This difficulty is inherited from the staggering heterogeneity of the training computation--the model's embedding layer could include more than 99.99% of the total model size, which is extremely memory-intensive; while the rest neural network is increasingly computation-intensive. To support the training of such huge models, an efficient distributed training system is in urgent need. In this paper, we resolve this challenge by careful co-design of both the optimization algorithm and the distributed system architecture. Specifically, in order to ensure both the training efficiency and the training accuracy, we design a novel hybrid training algorithm, where the embedding layer and the dense neural network are handled by different synchronization mechanisms; then we build a system called Persia (short for parallel recommendation training system with hybrid acceleration) to support this hybrid training algorithm. Both theoretical demonstration and empirical study up to 100 trillion parameters have conducted to justified the system design and implementation of Persia. We make Persia publicly available (at https://github.com/PersiaML/Persia) so that anyone would be able to easily train a recommender model at the scale of 100 trillion parameters.

LGMay 23, 2024
Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling

Shuaipeng Li, Penghao Zhao, Hailin Zhang et al.

In current deep learning tasks, Adam style optimizers such as Adam, Adagrad, RMSProp, Adafactor, and Lion have been widely used as alternatives to SGD style optimizers. These optimizers typically update model parameters using the sign of gradients, resulting in more stable convergence curves. The learning rate and the batch size are the most critical hyperparameters for optimizers, which require careful tuning to enable effective convergence. Previous research has shown that the optimal learning rate increases linearly or follows similar rules with batch size for SGD style optimizers. However, this conclusion is not applicable to Adam style optimizers. In this paper, we elucidate the connection between optimal learning rates and batch sizes for Adam style optimizers through both theoretical analysis and extensive experiments. First, we raise the scaling law between batch sizes and optimal learning rates in the sign of gradient case, in which we prove that the optimal learning rate first rises and then falls as the batch size increases. Moreover, the peak value of the surge will gradually move toward the larger batch size as training progresses. Second, we conducted experiments on various CV and NLP tasks and verified the correctness of the scaling law.

CLJun 3, 2025
LLMs Can Also Do Well! Breaking Barriers in Semantic Role Labeling via Large Language Models

Xinxin Li, Huiyao Chen, Chengjun Liu et al.

Semantic role labeling (SRL) is a crucial task of natural language processing (NLP). Although generative decoder-based large language models (LLMs) have achieved remarkable success across various NLP tasks, they still lag behind state-of-the-art encoder-decoder (BERT-like) models in SRL. In this work, we seek to bridge this gap by equipping LLMs for SRL with two mechanisms: (a) retrieval-augmented generation and (b) self-correction. The first mechanism enables LLMs to leverage external linguistic knowledge such as predicate and argument structure descriptions, while the second allows LLMs to identify and correct inconsistent SRL outputs. We conduct extensive experiments on three widely-used benchmarks of SRL (CPB1.0, CoNLL-2009, and CoNLL-2012). Results demonstrate that our method achieves state-of-the-art performance in both Chinese and English, marking the first successful application of LLMs to surpass encoder-decoder approaches in SRL.

CVSep 11, 2025
Investigating the Impact of Various Loss Functions and Learnable Wiener Filter for Laparoscopic Image Desmoking

Chengyu Yang, Chengjun Liu

To rigorously assess the effectiveness and necessity of individual components within the recently proposed ULW framework for laparoscopic image desmoking, this paper presents a comprehensive ablation study. The ULW approach combines a U-Net based backbone with a compound loss function that comprises mean squared error (MSE), structural similarity index (SSIM) loss, and perceptual loss. The framework also incorporates a differentiable, learnable Wiener filter module. In this study, each component is systematically ablated to evaluate its specific contribution to the overall performance of the whole framework. The analysis includes: (1) removal of the learnable Wiener filter, (2) selective use of individual loss terms from the composite loss function. All variants are benchmarked on a publicly available paired laparoscopic images dataset using quantitative metrics (SSIM, PSNR, MSE and CIEDE-2000) alongside qualitative visual comparisons.

CVSep 11, 2025
Privacy-Preserving Automated Rosacea Detection Based on Medically Inspired Region of Interest Selection

Chengyu Yang, Rishik Reddy Yesgari, Chengjun Liu

Rosacea is a common but underdiagnosed inflammatory skin condition that primarily affects the central face and presents with subtle redness, pustules, and visible blood vessels. Automated detection remains challenging due to the diffuse nature of symptoms, the scarcity of labeled datasets, and privacy concerns associated with using identifiable facial images. A novel privacy-preserving automated rosacea detection method inspired by clinical priors and trained entirely on synthetic data is presented in this paper. Specifically, the proposed method, which leverages the observation that rosacea manifests predominantly through central facial erythema, first constructs a fixed redness-informed mask by selecting regions with consistently high red channel intensity across facial images. The mask thus is able to focus on diagnostically relevant areas such as the cheeks, nose, and forehead and exclude identity-revealing features. Second, the ResNet-18 deep learning method, which is trained on the masked synthetic images, achieves superior performance over the full-face baselines with notable gains in terms of accuracy, recall and F1 score when evaluated using the real-world test data. The experimental results demonstrate that the synthetic data and clinical priors can jointly enable accurate and ethical dermatological AI systems, especially for privacy sensitive applications in telemedicine and large-scale screening.

CVSep 11, 2025
Patch-based Automatic Rosacea Detection Using the ResNet Deep Learning Framework

Chengyu Yang, Rishik Reddy Yesgari, Chengjun Liu

Rosacea, which is a chronic inflammatory skin condition that manifests with facial redness, papules, and visible blood vessels, often requirs precise and early detection for significantly improving treatment effectiveness. This paper presents new patch-based automatic rosacea detection strategies using the ResNet-18 deep learning framework. The contributions of the proposed strategies come from the following aspects. First, various image pateches are extracted from the facial images of people in different sizes, shapes, and locations. Second, a number of investigation studies are carried out to evaluate how the localized visual information influences the deep learing model performance. Third, thorough experiments are implemented to reveal that several patch-based automatic rosacea detection strategies achieve competitive or superior accuracy and sensitivity than the full-image based methods. And finally, the proposed patch-based strategies, which use only localized patches, inherently preserve patient privacy by excluding any identifiable facial features from the data. The experimental results indicate that the proposed patch-based strategies guide the deep learning model to focus on clinically relevant regions, enhance robustness and interpretability, and protect patient privacy. As a result, the proposed strategies offer practical insights for improving automated dermatological diagnostics.

CVSep 7, 2021
Smart Traffic Monitoring System using Computer Vision and Edge Computing

Guanxiong Liu, Hang Shi, Abbas Kiani et al.

Traffic management systems capture tremendous video data and leverage advances in video processing to detect and monitor traffic incidents. The collected data are traditionally forwarded to the traffic management center (TMC) for in-depth analysis and may thus exacerbate the network paths to the TMC. To alleviate such bottlenecks, we propose to utilize edge computing by equipping edge nodes that are close to cameras with computing resources (e.g. cloudlets). A cloudlet, with limited computing resources as compared to TMC, provides limited video processing capabilities. In this paper, we focus on two common traffic monitoring tasks, congestion detection, and speed detection, and propose a two-tier edge computing based model that takes into account of both the limited computing capability in cloudlets and the unstable network condition to the TMC. Our solution utilizes two algorithms for each task, one implemented at the edge and the other one at the TMC, which are designed with the consideration of different computing resources. While the TMC provides strong computation power, the video quality it receives depends on the underlying network conditions. On the other hand, the edge processes very high-quality video but with limited computing resources. Our model captures this trade-off. We evaluate the performance of the proposed two-tier model as well as the traffic monitoring algorithms via test-bed experiments under different weather as well as network conditions and show that our proposed hybrid edge-cloud solution outperforms both the cloud-only and edge-only solutions.

LGJul 3, 2021
BAGUA: Scaling up Distributed Learning with System Relaxations

Shaoduo Gan, Xiangru Lian, Rui Wang et al.

Recent years have witnessed a growing list of systems for distributed data-parallel training. Existing systems largely fit into two paradigms, i.e., parameter server and MPI-style collective operations. On the algorithmic side, researchers have proposed a wide range of techniques to lower the communication via system relaxations: quantization, decentralization, and communication delay. However, most, if not all, existing systems only rely on standard synchronous and asynchronous stochastic gradient (SG) based optimization, therefore, cannot take advantage of all possible optimizations that the machine learning community has been developing recently. Given this emerging gap between the current landscapes of systems and theory, we build BAGUA, a MPI-style communication library, providing a collection of primitives, that is both flexible and modular to support state-of-the-art system relaxation techniques of distributed training. Powered by this design, BAGUA has a great ability to implement and extend various state-of-the-art distributed learning algorithms. In a production cluster with up to 16 machines (128 GPUs), BAGUA can outperform PyTorch-DDP, Horovod and BytePS in the end-to-end training time by a significant margin (up to 2 times) across a diverse range of tasks. Moreover, we conduct a rigorous tradeoff exploration showing that different algorithms and system relaxations achieve the best performance over different network conditions.