Shuiwang Li

h-index5

4papers

20citations

Novelty30%

AI Score34

Ranked #114,903 of 194,257 authors (top 59%)#38,330 in CV (top 65%)

4 Papers

6.8CVAug 20, 2023

Learning Disentangled Representation with Mutual Information Maximization for Real-Time UAV Tracking

Xucheng Wang, Xiangyang Yang, Hengzhou Ye et al.

Efficiency has been a critical problem in UAV tracking due to limitations in computation resources, battery capacity, and unmanned aerial vehicle maximum load. Although discriminative correlation filters (DCF)-based trackers prevail in this field for their favorable efficiency, some recently proposed lightweight deep learning (DL)-based trackers using model compression demonstrated quite remarkable CPU efficiency as well as precision. Unfortunately, the model compression methods utilized by these works, though simple, are still unable to achieve satisfying tracking precision with higher compression rates. This paper aims to exploit disentangled representation learning with mutual information maximization (DR-MIM) to further improve DL-based trackers' precision and efficiency for UAV tracking. The proposed disentangled representation separates the feature into an identity-related and an identity-unrelated features. Only the latter is used, which enhances the effectiveness of the feature representation for subsequent classification and regression tasks. Extensive experiments on four UAV benchmarks, including UAV123@10fps, DTB70, UAVDT and VisDrone2018, show that our DR-MIM tracker significantly outperforms state-of-the-art UAV tracking methods.

6.5CVApr 28, 2024Code

Tracking Transforming Objects: A Benchmark

You Wu, Yuelong Wang, Yaxin Liao et al.

Tracking transforming objects holds significant importance in various fields due to the dynamic nature of many real-world scenarios. By enabling systems accurately represent transforming objects over time, tracking transforming objects facilitates advancements in areas such as autonomous systems, human-computer interaction, and security applications. Moreover, understanding the behavior of transforming objects provides valuable insights into complex interactions or processes, contributing to the development of intelligent systems capable of robust and adaptive perception in dynamic environments. However, current research in the field mainly focuses on tracking generic objects. In this study, we bridge this gap by collecting a novel dedicated Dataset for Tracking Transforming Objects, called DTTO, which contains 100 sequences, amounting to approximately 9.3K frames. We provide carefully hand-annotated bounding boxes for each frame within these sequences, making DTTO the pioneering benchmark dedicated to tracking transforming objects. We thoroughly evaluate 20 state-of-the-art trackers on the benchmark, aiming to comprehend the performance of existing methods and provide a comparison for future research on DTTO. With the release of DTTO, our goal is to facilitate further research and applications related to tracking transforming objects.

8.4CVJul 14, 2025

Vision-Based Anti Unmanned Aerial Technology: Opportunities and Challenges

Guanghai Ding, Yihua Ren, Yuting Liu et al.

With the rapid advancement of UAV technology and its extensive application in various fields such as military reconnaissance, environmental monitoring, and logistics, achieving efficient and accurate Anti-UAV tracking has become essential. The importance of Anti-UAV tracking is increasingly prominent, especially in scenarios such as public safety, border patrol, search and rescue, and agricultural monitoring, where operations in complex environments can provide enhanced security. Current mainstream Anti-UAV tracking technologies are primarily centered around computer vision techniques, particularly those that integrate multi-sensor data fusion with advanced detection and tracking algorithms. This paper first reviews the characteristics and current challenges of Anti-UAV detection and tracking technologies. Next, it investigates and compiles several publicly available datasets, providing accessible links to support researchers in efficiently addressing related challenges. Furthermore, the paper analyzes the major vision-based and vision-fusion-based Anti-UAV detection and tracking algorithms proposed in recent years. Finally, based on the above research, this paper outlines future research directions, aiming to provide valuable insights for advancing the field.

3.6CVNov 17, 2025

CapeNext: Rethinking and refining dynamic support information for category-agnostic pose estimation

Yu Zhu, Dan Zeng, Shuiwang Li et al.

Recent research in Category-Agnostic Pose Estimation (CAPE) has adopted fixed textual keypoint description as semantic prior for two-stage pose matching frameworks. While this paradigm enhances robustness and flexibility by disentangling the dependency of support images, our critical analysis reveals two inherent limitations of static joint embedding: (1) polysemy-induced cross-category ambiguity during the matching process(e.g., the concept "leg" exhibiting divergent visual manifestations across humans and furniture), and (2) insufficient discriminability for fine-grained intra-category variations (e.g., posture and fur discrepancies between a sleeping white cat and a standing black cat). To overcome these challenges, we propose a new framework that innovatively integrates hierarchical cross-modal interaction with dual-stream feature refinement, enhancing the joint embedding with both class-level and instance-specific cues from textual description and specific images. Experiments on the MP-100 dataset demonstrate that, regardless of the network backbone, CapeNext consistently outperforms state-of-the-art CAPE methods by a large margin.