Yuchao Wang

h-index: 9

4 papers

510 citations

Novelty: 48%

AI Score: 45

Ranked #42,643 of 194,257 authors (top 22%) | #14,973 in CV (top 25%)

4 Papers

37.7 | CV | Mar 8, 2022 | Code

Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

Yuchao Wang, Haochen Wang, Yujun Shen et al.

The crux of semi-supervised semantic segmentation is to assign adequate pseudo-labels to the pixels of unlabeled images. A common practice is to select the highly confident predictions as the pseudo ground-truth, but this leaves most pixels unused because their predictions are unreliable. We argue that every pixel matters to the model training, even if its prediction is ambiguous. Intuitively, an unreliable prediction may get confused among the top classes (i.e., those with the highest probabilities); however, it should be confident about the pixel not belonging to the remaining classes. Hence, such a pixel can be convincingly treated as a negative sample for those most unlikely categories. Based on this insight, we develop an effective pipeline to make sufficient use of unlabeled data. Concretely, we separate reliable and unreliable pixels via the entropy of predictions, push each unreliable pixel to a category-wise queue that consists of negative samples, and manage to train the model with all candidate pixels. Considering the training evolution, where the prediction becomes more and more accurate, we adaptively adjust the threshold for the reliable-unreliable partition. Experimental results on various benchmarks and training settings demonstrate the superiority of our approach over the state-of-the-art alternatives.
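The entropy-based partition described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the linear schedule in `unreliable_fraction`, the default `initial_fraction`, and the quantile-based threshold are all assumptions chosen for illustration. The abstract only states that pixels are split by prediction entropy and that the threshold adapts during training.

```python
import numpy as np

def pixel_entropy(probs, eps=1e-12):
    """Per-pixel entropy of class probabilities with shape (C, H, W)."""
    return -np.sum(probs * np.log(probs + eps), axis=0)

def unreliable_fraction(epoch, total_epochs, initial_fraction=0.2):
    """Assumed linear schedule: the share of pixels treated as unreliable
    shrinks to zero as training proceeds (hypothetical, not from the paper)."""
    return initial_fraction * (1.0 - epoch / total_epochs)

def split_pixels(probs, epoch, total_epochs):
    """Return boolean masks (reliable, unreliable), each of shape (H, W)."""
    entropy = pixel_entropy(probs)
    frac = unreliable_fraction(epoch, total_epochs)
    # Pixels whose entropy exceeds the (1 - frac) quantile count as unreliable.
    threshold = np.quantile(entropy, 1.0 - frac)
    reliable = entropy <= threshold
    return reliable, ~reliable

def negative_classes(probs, num_negative):
    """For each pixel, the indices of the num_negative least likely classes."""
    order = np.argsort(probs, axis=0)  # ascending along the class axis
    return order[:num_negative]

# Toy 2x2 image with 4 classes: three confident pixels, one ambiguous pixel.
confident = np.array([0.97, 0.01, 0.01, 0.01])
ambiguous = np.array([0.45, 0.35, 0.15, 0.05])
probs = np.stack([confident, confident, confident, ambiguous], axis=1).reshape(4, 2, 2)

reliable, unreliable = split_pixels(probs, epoch=0, total_epochs=10)
neg = negative_classes(probs, num_negative=2)
```

In this toy input only the ambiguous pixel at position (1, 1) falls above the threshold, and its two least likely classes are the ones it could serve as a negative sample for. How such negatives are queued and used in the loss is left out here because the abstract does not specify it.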

7.0 | NA | May 25

Effective algorithms for tensor train decomposition via the UTV framework

Yuchao Wang, Maolin Che, Yimin Wei

The tensor-train (TT) decomposition is widely used to compress large tensors into a more compact form by exploiting their inherent data structures. A fundamental approach for constructing the TT format is the well-known TT-SVD method, which performs singular value decompositions (SVDs) on the successive matrices sequentially. But in practical applications, it is often unnecessary to compute full SVDs. In this article, we propose a new method called the TT-UTV. It utilizes the virtues of rank-revealing UTV decomposition to compute the TT format for a large-scale tensor, resulting in lower computational cost. We analyze the error bounds on the accuracy of these algorithms in both the URV and ULV cases and then recommend different sweep patterns for these two cases. Based on the theoretical analysis, we also formulate the rank-adaptive algorithms with prescribed accuracy. Numerical experiments on various applications, including magnetic resonance imaging data completion, are performed to illustrate their good performance in practice.
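For orientation, the baseline TT-SVD procedure mentioned in this abstract (sequential SVDs of successive unfoldings) can be sketched in NumPy as follows. This is the standard TT-SVD, not the proposed TT-UTV method, and the relative-tolerance truncation rule `rel_tol` and the helper names are assumptions made for this illustration.

```python
import numpy as np

def tt_svd(tensor, rel_tol=1e-10):
    """Decompose `tensor` into TT cores via sequential truncated SVDs.

    Core k has shape (r_k, n_k, r_{k+1}) with r_0 = r_d = 1.
    """
    dims = tensor.shape
    cores = []
    rank = 1
    remainder = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(remainder, full_matrices=False)
        # Assumed truncation rule: keep singular values above rel_tol * largest.
        new_rank = max(1, int(np.sum(s > rel_tol * s[0])))
        cores.append(u[:, :new_rank].reshape(rank, dims[k], new_rank))
        # Carry the remaining factor diag(s) @ vt to the next unfolding.
        remainder = s[:new_rank, None] * vt[:new_rank]
        rank = new_rank
        remainder = remainder.reshape(rank * dims[k + 1], -1)
    cores.append(remainder.reshape(rank, dims[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract TT cores back into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))

# Build a 4 x 5 x 6 tensor with TT ranks (2, 3) from random cores, then recover it.
rng = np.random.default_rng(0)
true_cores = [rng.standard_normal(shape) for shape in [(1, 4, 2), (2, 5, 3), (3, 6, 1)]]
x = tt_to_full(true_cores)
cores = tt_svd(x)
x_rec = tt_to_full(cores)
```

Each loop iteration computes a full thin SVD of the current unfolding and then discards most of it. That is the cost the abstract proposes to reduce by substituting a rank-revealing UTV factorization at that step.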

8.7 | CV | Jun 11, 2024

RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks

Zhechao Wang, Peirui Cheng, Pengju Tian et al.

Remote sensing lightweight foundation models have achieved notable success in online perception within remote sensing. However, their capabilities are restricted to performing online inference solely based on their own observations and models, thus lacking a comprehensive understanding of large-scale remote sensing scenarios. To overcome this limitation, we propose a Remote Sensing Distributed Foundation Model (RS-DFM) based on generalized information mapping and interaction. This model can realize online collaborative perception across multiple platforms and various downstream tasks by mapping observations into a unified space and implementing a task-agnostic information interaction strategy. Specifically, we leverage the ground-based geometric prior of remote sensing oblique observations to transform the feature mapping from absolute depth estimation to relative depth estimation, thereby enhancing the model's ability to extract generalized features across diverse heights and perspectives. Additionally, we present a dual-branch information compression module to decouple high-frequency and low-frequency feature information, achieving feature-level compression while preserving essential task-agnostic details. In support of our research, we create a multi-task simulation dataset named AirCo-MultiTasks for multi-UAV collaborative observation. We also conduct extensive experiments, including 3D object detection, instance segmentation, and trajectory prediction. The results demonstrate that our RS-DFM achieves state-of-the-art performance across various downstream tasks.

11.3 | CV | Jun 7, 2024

UVCPNet: A UAV-Vehicle Collaborative Perception Network for 3D Object Detection

Yuchao Wang, Peirui Cheng, Pengju Tian et al.

With the advancement of collaborative perception, the role of aerial-ground collaborative perception, a crucial component, is becoming increasingly important. The demand for collaborative perception across different perspectives to construct more comprehensive perceptual information is growing. However, challenges arise due to the disparities in the field of view between cross-domain agents and their varying sensitivity to information in images. Additionally, when we transform image features into Bird's Eye View (BEV) features for collaboration, we need accurate depth information. To address these issues, we propose a framework specifically designed for aerial-ground collaboration. First, to mitigate the lack of datasets for aerial-ground collaboration, we develop a virtual dataset named V2U-COO for our research. Second, we design a Cross-Domain Cross-Adaptation (CDCA) module to align the target information obtained from different domains, thereby achieving more accurate perception results. Finally, we introduce a Collaborative Depth Optimization (CDO) module to obtain more precise depth estimation results, leading to more accurate perception outcomes. We conduct extensive experiments on both our virtual dataset and a public dataset to validate the effectiveness of our framework. Our experiments on the V2U-COO dataset and the DAIR-V2X dataset demonstrate that our method improves detection accuracy by 6.1% and 2.7%, respectively.