Wei Li

h-index34

4papers

146citations

Novelty60%

AI Score35

Ranked #106,225 of 194,257 authors (top 55%)#35,557 in CV (top 60%)

4 Papers

8.4CVMar 31, 2023Code

Siamese DETR

Zeren Chen, Gengshi Huang, Wei Li et al.

Recent self-supervised methods are mainly designed for representation learning with the base model, e.g., ResNets or ViTs. They cannot be easily transferred to DETR, with task-specific Transformer modules. In this work, we present Siamese DETR, a Siamese self-supervised pretraining approach for the Transformer architecture in DETR. We consider learning view-invariant and detection-oriented representations simultaneously through two complementary tasks, i.e., localization and discrimination, in a novel multi-view learning framework. Two self-supervised pretext tasks are designed: (i) Multi-View Region Detection aims at learning to localize regions-of-interest between augmented views of the input, and (ii) Multi-View Semantic Discrimination attempts to improve object-level discrimination for each region. The proposed Siamese DETR achieves state-of-the-art transfer performance on COCO and PASCAL VOC detection using different DETR variants in all setups. Code is available at https://github.com/Zx55/SiameseDETR.

16.2CVApr 19, 2021Code

ASFM-Net: Asymmetrical Siamese Feature Matching Network for Point Completion

Yaqi Xia, Yan Xia, Wei Li et al.

We tackle the problem of object completion from point clouds and propose a novel point cloud completion network employing an Asymmetrical Siamese Feature Matching strategy, termed as ASFM-Net. Specifically, the Siamese auto-encoder neural network is adopted to map the partial and complete input point cloud into a shared latent space, which can capture detailed shape prior. Then we design an iterative refinement unit to generate complete shapes with fine-grained details by integrating prior information. Experiments are conducted on the PCN dataset and the Completion3D benchmark, demonstrating the state-of-the-art performance of the proposed ASFM-Net. Our method achieves the 1st place in the leaderboard of Completion3D and outperforms existing methods with a large margin, about 12%. The codes and trained models are released publicly at https://github.com/Yan-Xia/ASFM-Net.

10.5CVMar 31, 2024

IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions

Zhijun Tu, Kunpeng Du, Hanting Chen et al.

Recent advances have demonstrated the powerful capability of transformer architecture in image restoration. However, our analysis indicates that existing transformerbased methods can not establish both exact global and local dependencies simultaneously, which are much critical to restore the details and missing content of degraded images. To this end, we present an efficient image processing transformer architecture with hierarchical attentions, called IPTV2, adopting a focal context self-attention (FCSA) and a global grid self-attention (GGSA) to obtain adequate token interactions in local and global receptive fields. Specifically, FCSA applies the shifted window mechanism into the channel self-attention, helps capture the local context and mutual interaction across channels. And GGSA constructs long-range dependencies in the cross-window grid, aggregates global information in spatial dimension. Moreover, we introduce structural re-parameterization technique to feed-forward network to further improve the model capability. Extensive experiments demonstrate that our proposed IPT-V2 achieves state-of-the-art results on various image processing tasks, covering denoising, deblurring, deraining and obtains much better trade-off for performance and computational complexity than previous methods. Besides, we extend our method to image generation as latent diffusion backbone, and significantly outperforms DiTs.

1.7ROFeb 6, 2017

From Formalised State Machines to Implementations of Robotic Controllers

Wei Li, Alvaro Miyazawa, Pedro Ribeiro et al.

Controllers for autonomous robotic systems can be specified using state machines. However, these are typically developed in an ad hoc manner without formal semantics, which makes it difficult to analyse the controller. Simulations are often used during the development, but a rigorous connection between the designed controller and the implementation is often overlooked. This paper presents a state-machine based notation, RoboChart, together with a tool to automatically create code from the state machines, establishing a rigorous connection between specification and implementation. In RoboChart, a robot's controller is specified either graphically or using a textual description language. The controller code for simulation is automatically generated through a direct mapping from the specification. We demonstrate our approach using two case studies (self-organized aggregation and swarm taxis) in swarm robotics. The simulations are presented using two different simulators showing the general applicability of our approach.