Yirui Wang

h-index: 34

4 papers

39 citations

Novelty: 41%

AI Score: 40

Ranked #76,173 of 194,257 authors (top 39%); #25,836 in CV (top 44%)

4 Papers

13.0 | LG | Jun 1, 2023 | Code

Improving and Benchmarking Offline Reinforcement Learning Algorithms

Bingyi Kang, Xiao Ma, Yirui Wang et al.

Recently, Offline Reinforcement Learning (RL) has achieved remarkable progress with the emergence of various algorithms and datasets. However, these methods usually focus on algorithmic advancements, ignoring that many low-level implementation choices considerably influence or even drive the final performance. As a result, it becomes hard to attribute the progress in Offline RL, as these choices are not sufficiently discussed and aligned in the literature. In addition, papers focusing on one dataset (e.g., D4RL) often ignore algorithms proposed on another dataset (e.g., RL Unplugged), causing isolation among the algorithms, which might slow down the overall progress. Therefore, this work aims to bridge the gaps caused by low-level choices and datasets. To this end, we empirically investigate 20 implementation choices using three representative algorithms (i.e., CQL, CRR, and IQL) and present a guidebook for choosing implementations. Following the guidebook, we find two variants, CRR+ and CQL+, that achieve a new state of the art on D4RL. Moreover, we benchmark eight popular offline RL algorithms across datasets under a unified training and evaluation framework. The findings are inspiring: the success of a learning paradigm depends heavily on the data distribution, and some previous conclusions are biased by the dataset used. Our code is available at https://github.com/sail-sg/offbench.

1.4 | CV | Jul 22, 2022

PieTrack: An MOT solution based on synthetic data training and self-supervised domain adaptation

Yirui Wang, Shenghua He, Youbao Tang et al.

To cope with the increasing demand for labeled data and the privacy issues associated with human detection, synthetic data has been used as a substitute and has shown promising results in human detection and tracking tasks. We participate in the 7th Workshop on Benchmarking Multi-Target Tracking (BMTT), themed "How Far Can Synthetic Data Take us?" Our solution, PieTrack, is developed based on synthetic data without using any pre-trained weights. We propose a self-supervised domain adaptation method that mitigates the domain shift between synthetic data (e.g., MOTSynth) and real data (e.g., MOT17) without involving extra human labels. By leveraging the proposed multi-scale ensemble inference, we achieved a final HOTA score of 58.7 on the MOT17 testing set, ranking third in the challenge.

17.8 | RO | Apr 16

ShapeGen: Robotic Data Generation for Category-Level Manipulation

Yirui Wang, Xiuwei Xu, Angyuan Ma et al.

Manipulation policies deployed in uncontrolled real-world scenarios face great in-category geometric diversity among everyday objects. To function robustly under such variations, policies need to work in a category-level manner, i.e., knowing how to interact with any object in a certain category instead of only a specific one seen during training. This in-category generalizability is usually nurtured with shape-diversified training data; however, manually collecting such a corpus of data is infeasible because it requires intense human labor and large collections of diverse objects at hand. In this paper, we propose ShapeGen, a data generation method that aims at generating shape-varied manipulation data in a simulator-free and 3D manner. ShapeGen decomposes the process into two stages: Shape Library curation and Function-Aware Generation. In the first stage, we train spatial warpings between shapes that map points to functionally corresponding points, and aggregate 3D models along with the warpings into a plug-and-play Shape Library. In the second stage, we design a pipeline that, leveraging established Libraries, requires only minimal human annotation to generate physically plausible and functionally correct novel demonstrations. Experiments in the real world demonstrate the effectiveness of ShapeGen in boosting policies' in-category shape generalizability. Project page: https://wangyr22.github.io/ShapeGen/.

5.4 | NE | Jan 27, 2021

ASBSO: An Improved Brain Storm Optimization With Flexible Search Length and Memory-Based Selection

Yang Yu, Shangce Gao, Yirui Wang et al.

Brain storm optimization (BSO) is a newly proposed population-based optimization algorithm, which uses a logarithmic sigmoid transfer function to adjust its search range during the convergence process. However, this adjustment varies only with the current iteration number and lacks flexibility and variety, which results in poor search efficiency and robustness of BSO. To alleviate this problem, an adaptive step length structure together with a success memory selection strategy is proposed to be incorporated into BSO. This proposed method, adaptive step length based on memory selection BSO, namely ASBSO, applies multiple step lengths to modify the generation process of new solutions, thus supplying a flexible search according to corresponding problems and convergent periods. The novel memory mechanism, which is capable of evaluating and storing the degree of improvements of solutions, is used to determine the selection possibility of step lengths. A set of 57 benchmark functions is used to test ASBSO's search ability, and four real-world problems are adopted to show its application value. All these test results indicate a remarkable improvement in the solution quality, scalability, and robustness of ASBSO.