Yuzhou Peng


2 Papers

CV · Nov 14, 2022
SportsTrack: An Innovative Method for Tracking Athletes in Sports Scenes

Jie Wang, Yuzhou Peng, Xiaodong Yang et al.

The SportsMOT dataset targets multiple object tracking (MOT) of athletes in sports scenes such as basketball and soccer. The dataset is challenging because of unstable camera views, athletes' complex trajectories, and complicated backgrounds. Previous MOT methods cannot match enough high-quality athlete tracks. To improve MOT performance in sports scenes, we introduce a tracker named SportsTrack, which follows the tracking-by-detection paradigm. We introduce a three-stage matching process to handle motion blur and body overlap in sports scenes. We also present a second innovation: a one-to-many correspondence between detection bounding boxes and crowded tracks, which handles the overlap of athletes' bodies during competition. Unlike other trackers such as BoT-SORT and ByteTrack, SportsTrack also restores edge-lost tracks that those trackers ignore. Finally, we achieve state-of-the-art results on the SportsMOT dataset.
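The one-to-many correspondence mentioned in the abstract above can be illustrated with a small sketch. This is not the SportsTrack implementation: the abstract does not give the matching rules, so the greedy IoU matching, the `iou_thresh` value, and the function names `iou` and `associate` are all assumptions made for illustration. The sketch first matches tracks to detections one-to-one, then lets any still-unmatched track share a detection box that was already claimed, which is one plausible way to keep two overlapping athletes alive when the detector returns a single box.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(
    tracks: Dict[int, Box],
    detections: List[Box],
    iou_thresh: float = 0.3,  # hypothetical threshold, not from the paper
) -> Dict[int, int]:
    """Map track id -> detection index; several tracks may share a detection."""
    # All (score, track id, detection index) candidates, best overlap first.
    pairs = sorted(
        (
            (iou(t_box, d_box), tid, di)
            for tid, t_box in tracks.items()
            for di, d_box in enumerate(detections)
        ),
        reverse=True,
    )

    assigned: Dict[int, int] = {}
    used = set()

    # Pass 1: greedy one-to-one matching.
    for score, tid, di in pairs:
        if score < iou_thresh:
            break
        if tid in assigned or di in used:
            continue
        assigned[tid] = di
        used.add(di)

    # Pass 2: an unmatched track may reuse its best-overlapping detection,
    # even if another track already claimed it (one-to-many).
    for score, tid, di in pairs:
        if score < iou_thresh:
            break
        if tid not in assigned:
            assigned[tid] = di

    return assigned


if __name__ == "__main__":
    # Two overlapping athletes, one merged detection, one distant track.
    tracks = {1: (0, 0, 10, 10), 2: (2, 0, 12, 10), 3: (100, 100, 110, 110)}
    detections = [(1, 0, 11, 10)]
    print(associate(tracks, detections))
```

In this toy setup, tracks 1 and 2 both overlap the single detection and both stay matched to it, while track 3 has no overlapping detection and stays unmatched. A real tracker would typically replace the greedy pass with an optimal assignment and add motion and appearance cues.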

CV · May 16, 2023
Multi-modal Visual Understanding with Prompts for Semantic Information Disentanglement of Image

Yuzhou Peng

Multi-modal visual understanding of images with prompts uses visual and textual cues to enhance the semantic understanding of images. This approach combines vision and language processing to produce more accurate image predictions and recognition. With prompt-based techniques, models can learn to focus on certain features of an image and extract useful information for downstream tasks. Additionally, multi-modal understanding can improve on single-modality models by providing more robust image representations. Overall, the combination of visual and textual information is a promising area of research for advancing image recognition and understanding. In this paper, we try a number of prompt design methods and propose a new method for better extraction of semantic information.