Haotong Cheng

2papers

2 Papers

59.0CVMar 20Code
Vision-Language Attribute Disentanglement and Reinforcement for Lifelong Person Re-Identification

Kunlun Xu, Haotong Cheng, Jiangmeng Li et al.

Lifelong person re-identification (LReID) aims to learn from varying domains to obtain a unified person retrieval model. Existing LReID approaches typically focus on learning from scratch or a visual classification-pretrained model, while the Vision-Language Model (VLM) has shown generalizable knowledge in a variety of tasks. Although existing methods can be directly adapted to the VLM, since they only consider global-aware learning, the fine-grained attribute knowledge is underleveraged, leading to limited acquisition and anti-forgetting capacity. To address this problem, we introduce a novel VLM-driven LReID approach named Vision-Language Attribute Disentanglement and Reinforcement (VLADR). Our key idea is to explicitly model the universally shared human attributes to improve inter-domain knowledge transfer, thereby effectively utilizing historical knowledge to reinforce new knowledge learning and alleviate forgetting. Specifically, VLADR includes a Multi-grain Text Attribute Disentanglement mechanism that mines the global and diverse local text attributes of an image. Then, an Inter-domain Cross-modal Attribute Reinforcement scheme is developed, which introduces cross-modal attribute alignment to guide visual attribute extraction and adopts inter-domain attribute alignment to achieve fine-grained knowledge transfer. Experimental results demonstrate that our VLADR outperforms the state-of-the-art methods by 1.9\%-2.2\% and 2.1\%-2.5\% on anti-forgetting and generalization capacity. Our source code is available at https://github.com/zhoujiahuan1991/CVPR2026-VLADR

CVMay 6, 2025
Optimization of Module Transferability in Single Image Super-Resolution: Universality Assessment and Cycle Residual Blocks

Haotong Cheng, Zhiqi Zhang, Hao Li et al.

Deep learning has substantially advanced the field of Single Image Super-Resolution (SISR). However, existing research has predominantly focused on raw performance gains, with little attention paid to quantifying the transferability of architectural components. In this paper, we introduce the concept of "Universality" and its associated definitions, which extend the traditional notion of "Generalization" to encompass the ease of transferability of modules. We then propose the Universality Assessment Equation (UAE), a metric that quantifies how readily a given module can be transplanted across models and reveals the combined influence of multiple existing metrics on transferability. Guided by the UAE results of standard residual blocks and other plug-and-play modules, we further design two optimized modules: the Cycle Residual Block (CRB) and the Depth-Wise Cycle Residual Block (DCRB). Through comprehensive experiments on natural-scene benchmarks, remote-sensing datasets, and other low-level tasks, we demonstrate that networks embedded with the proposed plug-and-play modules outperform several state-of-the-art methods, achieving a PSNR improvement of up to 0.83 dB or enabling a 71.3% reduction in parameters with negligible loss in reconstruction fidelity. Similar optimization approaches could be applied to a broader range of basic modules, offering a new paradigm for the design of plug-and-play modules.