LGAug 16, 2024Code
Parallel Unlearning in Inherited Model NetworksXiao Liu, Mingyuan Li, Guangsheng Yu et al.
Unlearning is challenging in generic learning frameworks with the continuous growth and updates of models exhibiting complex inheritance relationships. This paper presents a novel unlearning framework that enables fully parallel unlearning among models exhibiting inheritance. We use a chronologically Directed Acyclic Graph (DAG) to capture various unlearning scenarios occurring in model inheritance networks. Central to our framework is the Fisher Inheritance Unlearning (FIUn) method, designed to enable efficient parallel unlearning within the DAG. FIUn utilizes the Fisher Information Matrix (FIM) to assess the significance of model parameters for unlearning tasks and adjusts them accordingly. To handle multiple unlearning requests simultaneously, we propose the Merging-FIM (MFIM) function, which consolidates FIMs from multiple upstream models into a unified matrix. This design supports all unlearning scenarios captured by the DAG, enabling one-shot removal of inherited knowledge while significantly reducing computational overhead. Experiments confirm the effectiveness of our unlearning framework. For single-class tasks, it achieves complete unlearning with 0% accuracy for unlearned labels while maintaining 94.53% accuracy for retained labels. For multi-class tasks, the accuracy is 1.07% for unlearned labels and 84.77% for retained labels. Our framework accelerates unlearning by 99% compared to alternative methods. Code is in https://github.com/MJLee00/Parallel-Unlearning-in-Inherited-Model-Networks.
LGDec 10, 2025Code
CFLight: Enhancing Safety with Traffic Signal Control through Counterfactual LearningMingyuan Li, Chunyu Liu, Zhuojun Li et al.
Traffic accidents result in millions of injuries and fatalities globally, with a significant number occurring at intersections each year. Traffic Signal Control (TSC) is an effective strategy for enhancing safety at these urban junctures. Despite the growing popularity of Reinforcement Learning (RL) methods in optimizing TSC, these methods often prioritize driving efficiency over safety, thus failing to address the critical balance between these two aspects. Additionally, these methods usually need more interpretability. CounterFactual (CF) learning is a promising approach for various causal analysis fields. In this study, we introduce a novel framework to improve RL for safety aspects in TSC. This framework introduces a novel method based on CF learning to address the question: ``What if, when an unsafe event occurs, we backtrack to perform alternative actions, and will this unsafe event still occur in the subsequent period?'' To answer this question, we propose a new structure causal model to predict the result after executing different actions, and we propose a new CF module that integrates with additional ``X'' modules to promote safe RL practices. Our new algorithm, CFLight, which is derived from this framework, effectively tackles challenging safety events and significantly improves safety at intersections through a near-zero collision control strategy. Through extensive numerical experiments on both real-world and synthetic datasets, we demonstrate that CFLight reduces collisions and improves overall traffic performance compared to conventional RL methods and the recent safe RL model. Moreover, our method represents a generalized and safe framework for RL methods, opening possibilities for applications in other domains. The data and code are available in the github https://github.com/MJLee00/CFLight-Enhancing-Safety-with-Traffic-Signal-Control-through-Counterfactual-Learning.
CVMar 7, 2024Code
AO-DETR: Anti-Overlapping DETR for X-Ray Prohibited Items DetectionMingyuan Li, Tong Jia, Hao Wang et al.
Prohibited item detection in X-ray images is one of the most essential and highly effective methods widely employed in various security inspection scenarios. Considering the significant overlapping phenomenon in X-ray prohibited item images, we propose an Anti-Overlapping DETR (AO-DETR) based on one of the state-of-the-art general object detectors, DINO. Specifically, to address the feature coupling issue caused by overlapping phenomena, we introduce the Category-Specific One-to-One Assignment (CSA) strategy to constrain category-specific object queries in predicting prohibited items of fixed categories, which can enhance their ability to extract features specific to prohibited items of a particular category from the overlapping foreground-background features. To address the edge blurring problem caused by overlapping phenomena, we propose the Look Forward Densely (LFD) scheme, which improves the localization accuracy of reference boxes in mid-to-high-level decoder layers and enhances the ability to locate blurry edges of the final layer. Similar to DINO, our AO-DETR provides two different versions with distinct backbones, tailored to meet diverse application requirements. Extensive experiments on the PIXray and OPIXray datasets demonstrate that the proposed method surpasses the state-of-the-art object detectors, indicating its potential applications in the field of prohibited item detection. The source code will be released at https://github.com/Limingyuan001/AO-DETR-test.
CVJun 16, 2025Code
FOAM: A General Frequency-Optimized Anti-Overlapping Framework for Overlapping Object PerceptionMingyuan Li, Tong Jia, Han Gu et al.
Overlapping object perception aims to decouple the randomly overlapping foreground-background features, extracting foreground features while suppressing background features, which holds significant application value in fields such as security screening and medical auxiliary diagnosis. Despite some research efforts to tackle the challenge of overlapping object perception, most solutions are confined to the spatial domain. Through frequency domain analysis, we observe that the degradation of contours and textures due to the overlapping phenomenon can be intuitively reflected in the magnitude spectrum. Based on this observation, we propose a general Frequency-Optimized Anti-Overlapping Framework (FOAM) to assist the model in extracting more texture and contour information, thereby enhancing the ability for anti-overlapping object perception. Specifically, we design the Frequency Spatial Transformer Block (FSTB), which can simultaneously extract features from both the frequency and spatial domains, helping the network capture more texture features from the foreground. In addition, we introduce the Hierarchical De-Corrupting (HDC) mechanism, which aligns adjacent features in the separately constructed base branch and corruption branch using a specially designed consistent loss during the training phase. This mechanism suppresses the response to irrelevant background features of FSTBs, thereby improving the perception of foreground contour. We conduct extensive experiments to validate the effectiveness and generalization of the proposed FOAM, which further improves the accuracy of state-of-the-art models on four datasets, specifically for the three overlapping object perception tasks: Prohibited Item Detection, Prohibited Item Segmentation, and Pneumonia Detection. The code will be open source once the paper is accepted.
CVJan 28, 2025Code
CSPCL: Category Semantic Prior Contrastive Learning for Deformable DETR-Based Prohibited Item DetectorsMingyuan Li, Tong Jia, Hao Wang et al.
Prohibited item detection based on X-ray images is one of the most effective security inspection methods. However, the foreground-background feature coupling caused by the overlapping phenomenon specific to X-ray images makes general detectors designed for natural images perform poorly. To address this issue, we propose a Category Semantic Prior Contrastive Learning (CSPCL) mechanism, which aligns the class prototypes perceived by the classifier with the content queries to correct and supplement the missing semantic information responsible for classification, thereby enhancing the model sensitivity to foreground features. To achieve this alignment, we design a specific contrastive loss, CSP loss, which comprises the Intra-Class Truncated Attraction (ITA) loss and the Inter-Class Adaptive Repulsion (IAR) loss, and outperforms classic contrastive losses. Specifically, the ITA loss leverages class prototypes to attract intra-class content queries and preserves essential intra-class diversity via a gradient truncation function. The IAR loss employs class prototypes to adaptively repel inter-class content queries, with the repulsion strength scaled by prototype-prototype similarity, thereby improving inter-class discriminability, especially among similar categories. CSPCL is general and can be easily integrated into Deformable DETR-based models. Extensive experiments on the PIXray, OPIXray, PIDray, and CLCXray datasets demonstrate that CSPCL significantly enhances the performance of various state-of-the-art models without increasing inference complexity. The code is publicly available at https://github.com/Limingyuan001/CSPCL.
CVFeb 2, 2025
MedConv: Convolutions Beat Transformers on Long-Tailed Bone Density PredictionXuyin Qi, Zeyu Zhang, Huazhan Zheng et al.
Bone density prediction via CT scans to estimate T-scores is crucial, providing a more precise assessment of bone health compared to traditional methods like X-ray bone density tests, which lack spatial resolution and the ability to detect localized changes. However, CT-based prediction faces two major challenges: the high computational complexity of transformer-based architectures, which limits their deployment in portable and clinical settings, and the imbalanced, long-tailed distribution of real-world hospital data that skews predictions. To address these issues, we introduce MedConv, a convolutional model for bone density prediction that outperforms transformer models with lower computational demands. We also adapt Bal-CE loss and post-hoc logit adjustment to improve class balance. Extensive experiments on our AustinSpine dataset shows that our approach achieves up to 21% improvement in accuracy and 20% in ROC AUC over previous state-of-the-art methods.
SYJan 27, 2025
FuzzyLight: A Robust Two-Stage Fuzzy Approach for Traffic Signal Control Works in Real CitiesMingyuan Li, Jiahao Wang, Bo Du et al.
Effective traffic signal control (TSC) is crucial in mitigating urban congestion and reducing emissions. Recently, reinforcement learning (RL) has been the research trend for TSC. However, existing RL algorithms face several real-world challenges that hinder their practical deployment in TSC: (1) Sensor accuracy deteriorates with increased sensor detection range, and data transmission is prone to noise, potentially resulting in unsafe TSC decisions. (2) During the training of online RL, interactions with the environment could be unstable, potentially leading to inappropriate traffic signal phase (TSP) selection and traffic congestion. (3) Most current TSC algorithms focus only on TSP decisions, overlooking the critical aspect of phase duration, affecting safety and efficiency. To overcome these challenges, we propose a robust two-stage fuzzy approach called FuzzyLight, which integrates compressed sensing and RL for TSC deployment. FuzzyLight offers several key contributions: (1) It employs fuzzy logic and compressed sensing to address sensor noise and enhances the efficiency of TSP decisions. (2) It maintains stable performance during training and combines fuzzy logic with RL to generate precise phases. (3) It works in real cities across 22 intersections and demonstrates superior performance in both real-world and simulated environments. Experimental results indicate that FuzzyLight enhances traffic efficiency by 48% compared to expert-designed timings in the real world. Furthermore, it achieves state-of-the-art (SOTA) performance in simulated environments using six real-world datasets with transmission noise. The code and deployment video are available at the URL1
CRFeb 26, 2024
BlockFUL: Enabling Unlearning in Blockchained Federated LearningXiao Liu, Mingyuan Li, Xu Wang et al.
Unlearning in Federated Learning (FL) presents significant challenges, as models grow and evolve with complex inheritance relationships. This complexity is amplified when blockchain is employed to ensure the integrity and traceability of FL, where the need to edit multiple interlinked blockchain records and update all inherited models complicates the process.In this paper, we introduce Blockchained Federated Unlearning (BlockFUL), a novel framework with a dual-chain structure comprising a live chain and an archive chain for enabling unlearning capabilities within Blockchained FL. BlockFUL introduces two new unlearning paradigms, i.e., parallel and sequential paradigms, which can be effectively implemented through gradient-ascent-based and re-training-based unlearning methods. These methods enhance the unlearning process across multiple inherited models by enabling efficient consensus operations and reducing computational costs. Our extensive experiments validate that these methods effectively reduce data dependency and operational overhead, thereby boosting the overall performance of unlearning inherited models within BlockFUL on CIFAR-10 and Fashion-MNIST datasets using AlexNet, ResNet18, and MobileNetV2 models.
CVMar 8
SketchGraphNet: A Memory-Efficient Hybrid Graph Transformer for Large-Scale Sketch Corpora RecognitionShilong Chen, Mingyuan Li, Zhaoyang Wang et al.
This work investigates large-scale sketch recognition from a graph-native perspective, where free-hand sketches are directly modeled as structured graphs rather than raster images or stroke sequences. We propose SketchGraphNet, a hybrid graph neural architecture that integrates local message passing with a memory-efficient global attention mechanism, without relying on auxiliary positional or structural encodings. To support systematic evaluation, we construct SketchGraph, a large-scale benchmark comprising 3.44 million graph-structured sketches across 344 categories, with two variants (A and R) to reflect different noise conditions. Each sketch is represented as a spatiotemporal graph with normalized stroke-order attributes. On SketchGraph-A and SketchGraph-R, SketchGraphNet achieves Top-1 accuracies of 83.62% and 87.61%, respectively, under a unified training configuration. MemEffAttn further reduces peak GPU memory by over 40% and training time by more than 30% compared with Performer-based global attention, while maintaining comparable accuracy.
CVJun 16, 2024
Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIPShuyang Lin, Tong Jia, Hao Wang et al.
X-ray prohibited item detection is an essential component of security check and categories of prohibited item are continuously increasing in accordance with the latest laws. Previous works all focus on close-set scenarios, which can only recognize known categories used for training and often require time-consuming as well as labor-intensive annotations when learning novel categories, resulting in limited real-world applications. Although the success of vision-language models (e.g. CLIP) provides a new perspectives for open-set X-ray prohibited item detection, directly applying CLIP to X-ray domain leads to a sharp performance drop due to domain shift between X-ray data and general data used for pre-training CLIP. To address aforementioned challenges, in this paper, we introduce distillation-based open-vocabulary object detection (OVOD) task into X-ray security inspection domain by extending CLIP to learn visual representations in our specific X-ray domain, aiming to detect novel prohibited item categories beyond base categories on which the detector is trained. Specifically, we propose X-ray feature adapter and apply it to CLIP within OVOD framework to develop OVXD model. X-ray feature adapter containing three adapter submodules of bottleneck architecture, which is simple but can efficiently integrate new knowledge of X-ray domain with original knowledge, further bridge domain gap and promote alignment between X-ray images and textual concepts. Extensive experiments conducted on PIXray and PIDray datasets demonstrate that proposed method performs favorably against other baseline OVOD methods in detecting novel categories in X-ray scenario. It outperforms previous best result by 15.2 AP50 and 1.5 AP50 on PIXray and PIDray with achieving 21.0 AP50 and 27.8 AP50 respectively.
CVJun 5, 2024
MMCL: Correcting Content Query Distributions for Improved Anti-Overlapping X-Ray Object DetectionMingyuan Li, Tong Jia, Hui Lu et al.
Unlike natural images with occlusion-based overlap, X-ray images exhibit depth-induced superimposition and semi-transparent appearances, where objects at different depths overlap and their features blend together. These characteristics demand specialized mechanisms to disentangle mixed representations between target objects (e.g., prohibited items) and irrelevant backgrounds. While recent studies have explored adapting detection transformers (DETR) for anti-overlapping object detection, the importance of well-distributed content queries that represent object hypotheses remains underexplored. In this paper, we introduce a multi-class min-margin contrastive learning (MMCL) framework to correct the distribution of content queries, achieving balanced intra-class diversity and inter-class separability. The framework first groups content queries by object category and then applies two proposed complementary loss components: a multi-class exclusion loss to enhance inter-class separability, and a min-margin clustering loss to encourage intra-class diversity. We evaluate the proposed method on three widely used X-ray prohibited-item detection datasets, PIXray, OPIXray, and PIDray, using two backbone networks and four DETR variants. Experimental results demonstrate that MMCL effectively enhances anti-overlapping object detection and achieves state-of-the-art performance on both datasets. Code will be made publicly available on GitHub.