Yunkai Wang

RO
h-index: 10
11 papers
61 citations
Novelty: 41%
AI Score: 34

11 Papers

CV · Mar 7, 2022
Depth-Independent Depth Completion via Least Square Estimation

Xianze Fang, Yunkai Wang, Zexi Chen et al.

The depth completion task aims to produce a per-pixel dense depth map from a sparse depth map. In this paper, we propose an efficient least-squares-based, depth-independent method that completes the sparse depth map using the RGB image and the sparse depth map in two independent stages. In this way, we decouple the neural network from the sparse depth input, so that when some features of the sparse depth map change, such as the sparsity, our method can still produce a promising result. Moreover, thanks to the positional encoding and linear processing in our pipeline, we can easily produce a high-quality super-resolution dense depth map. We also test the generalization of our method on different datasets against some state-of-the-art algorithms. Experiments on the benchmark show that our method achieves competitive performance.
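The decoupling described in this abstract can be illustrated with a toy sketch. It assumes, hypothetically, that the RGB-only stage has already produced two basis values per pixel; the dense depth is then a linear combination of those bases, with weights fitted to the available sparse depth samples by least squares. The function names, the two-basis setup, and the toy data are illustrative assumptions, not the paper's implementation.

```python
def fit_weights(basis, sparse_depth):
    """Least-squares fit of (w0, w1) so that w0*b0 + w1*b1 matches the sparse depth.

    basis: list of (b0, b1) tuples, one per pixel (hypothetical RGB-derived features).
    sparse_depth: list of depth values, None where no measurement exists.
    """
    # Accumulate the 2x2 normal equations (B^T B) w = B^T d over measured pixels.
    a00 = a01 = a11 = r0 = r1 = 0.0
    for (b0, b1), d in zip(basis, sparse_depth):
        if d is None:
            continue
        a00 += b0 * b0
        a01 += b0 * b1
        a11 += b1 * b1
        r0 += b0 * d
        r1 += b1 * d
    det = a00 * a11 - a01 * a01
    if abs(det) < 1e-12:
        raise ValueError("sparse samples do not constrain both weights")
    # Solve the 2x2 system with Cramer's rule.
    w0 = (r0 * a11 - a01 * r1) / det
    w1 = (a00 * r1 - a01 * r0) / det
    return w0, w1


def complete_depth(basis, weights):
    """Dense depth as a linear combination of the per-pixel bases."""
    w0, w1 = weights
    return [w0 * b0 + w1 * b1 for b0, b1 in basis]


# Toy scene: b0 is constant, b1 is the pixel index, true depth is 2 + 0.5 * index.
basis = [(1.0, float(i)) for i in range(6)]
sparse = [2.0, None, None, 3.5, None, 4.5]  # only three pixels carry depth
weights = fit_weights(basis, sparse)
dense = complete_depth(basis, weights)
```

Because the bases do not depend on the depth input in this sketch, changing which pixels carry a measurement only changes the small linear solve, not the feature-producing stage.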

CV · Oct 31, 2020 · Code
PREGAN: Pose Randomization and Estimation for Weakly Paired Image Style Translation

Zexi Chen, Jiaxin Guo, Xuecheng Xu et al.

Using a trained model under different conditions without data annotation is attractive for robot applications. Towards this goal, one class of methods translates the image style from another environment to the one on which the models were trained. In this paper, we propose a weakly-paired setting for style translation, where the content in the two images is aligned up to errors in pose. Such images can be acquired by different sensors under different conditions that share an overlapping region, e.g. with LiDAR or stereo cameras, on sunny days or foggy nights. We consider this setting more practical because it offers: (i) easier labeling than paired data; (ii) better interpretability and detail retrieval than unpaired data. To translate across such images, we propose PREGAN, which trains a style translator by intentionally transforming the two images with a random pose and then estimating that random pose with a differentiable, non-trainable pose estimator, given that the better the styles are aligned, the better the estimated result is. Such adversarial training forces the network to learn the style translation without being entangled with other variations. Finally, PREGAN is validated on both simulated and real-world collected data to show its effectiveness. Results on downstream tasks (classification, road segmentation, object detection, and feature matching) show its potential for real applications. https://github.com/wrld/PRoGAN

AI · Sep 30, 2025
Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark

Minhui Zhu, Minyang Tian, Xiaocheng Yang et al.

While large language models (LLMs) with reasoning capabilities are progressing rapidly on high-school math competitions and coding, can they reason effectively through the complex, open-ended challenges found in frontier physics research? And crucially, what kinds of reasoning tasks do physicists want LLMs to assist with? To address these questions, we present CritPt (Complex Research using Integrated Thinking - Physics Test, pronounced "critical point"), the first benchmark designed to test LLMs on unpublished, research-level reasoning tasks that broadly cover modern physics research areas, including condensed matter, quantum physics, atomic, molecular & optical physics, astrophysics, high energy physics, mathematical physics, statistical physics, nuclear physics, nonlinear dynamics, fluid dynamics and biophysics. CritPt consists of 71 composite research challenges designed to simulate full-scale research projects at the entry level, which are also decomposed into 190 simpler checkpoint tasks for more fine-grained insights. All problems are newly created by 50+ active physics researchers based on their own research. Every problem is hand-curated to admit a guess-resistant and machine-verifiable answer and is evaluated by an automated grading pipeline heavily customized for advanced physics-specific output formats. We find that while current state-of-the-art LLMs show early promise on isolated checkpoints, they remain far from being able to reliably solve full research-scale challenges: the best average accuracy among base models is only 5.7%, achieved by GPT-5 (high), rising moderately to around 10% when equipped with coding tools. Through the realistic yet standardized evaluation offered by CritPt, we highlight a large disconnect between current model capabilities and realistic physics research demands, offering a foundation to guide the development of scientifically grounded AI tools.

CV · Sep 22, 2021
Domain Generalization for Vision-based Driving Trajectory Generation

Yunkai Wang, Dongkun Zhang, Yuxiang Cui et al.

One of the challenges in vision-based driving trajectory generation is dealing with out-of-distribution scenarios. In this paper, we propose a domain generalization method for vision-based driving trajectory generation for autonomous vehicles in urban environments, which can be seen as a way to extend the Invariant Risk Minimization (IRM) method to complex problems. We leverage an adversarial learning approach to train a trajectory generator as the decoder. Based on the pre-trained decoder, we infer the latent variables corresponding to the trajectories, and pre-train the encoder by regressing the inferred latent variables. Finally, we fix the decoder but fine-tune the encoder with the final trajectory loss. We compare our proposed method with the state-of-the-art trajectory generation method and some recent domain generalization methods, both on datasets and in simulation, demonstrating that our method has better generalization ability.

CV · Mar 7, 2021
Learn to Differ: Sim2Real Small Defection Segmentation Network

Zexi Chen, Zheyuan Huang, Yunkai Wang et al.

Recent deep-learning-based small defection segmentation approaches are trained in specific settings and tend to be limited by a fixed context. Throughout training, the network inevitably learns the representation of the background of the training data before figuring out the defection. Such networks underperform at inference once the context changes, a problem that can only be solved by retraining in every new setting. This ultimately limits practical robotic applications, where contexts keep varying. To cope with this, instead of training a network context by context and hoping it generalizes, why not stop misleading it with any limited context and start training it with pure simulation? In this paper, we propose the network SSDS, which learns to distinguish small defections between two images regardless of the context, so that the network can be trained once and for all. A small defection detection layer that exploits the pose sensitivity of phase correlation between images is introduced, followed by an outlier masking layer. The network is trained on randomly generated simulated data with simple shapes and generalizes to the real world. Finally, SSDS is validated on real-world collected data and demonstrates that, even when trained in cheap simulation, it can still find small defections in the real world, showing its effectiveness and its potential for practical applications.

RO · Oct 20, 2020
Imitation Learning of Hierarchical Driving Model: from Continuous Intention to Continuous Trajectory

Yunkai Wang, Dongkun Zhang, Jingke Wang et al.

One of the challenges in reducing the gap between machine and human-level driving is how to endow the system with the learning capacity to deal with the coupled complexity of environments, intentions, and dynamics. In this paper, we propose a hierarchical driving model with explicit models of continuous intention and continuous dynamics, which decouples the complexity in the observation-to-action reasoning in human driving data. Specifically, the continuous intention module takes as input the route planning map obtained by GPS and IMU, together with perception from an RGB camera and LiDAR, and generates a potential map in which obstacles and intentions are expressed as grid-based potentials. Then, the potential map is used as a condition, together with the current dynamics, to generate a continuous trajectory as output by a continuous function approximator network, whose derivatives can be used for supervision without additional parameters. Finally, we validate our method on both datasets and a simulator, demonstrating that our method has higher prediction accuracy for displacement and velocity and generates smoother trajectories. The method is also deployed on a real vehicle with loop latency, validating its effectiveness. To the best of our knowledge, this is the first work to produce the driving trajectory using a continuous function approximator network.

RO · Jul 22, 2020
Collaborative Localization of Aerial and Ground Mobile Robots through Orthomosaic Map

Xuecheng Xu, Zexi Chen, Jiaxin Guo et al.

As research on SLAM systems has deepened, cooperative SLAM with multiple robots has been proposed. This paper presents a map matching and localization approach for the cooperative SLAM of an aerial-ground system. The proposed approach aims to precisely match the maps constructed by two independent systems that have a large-scale variance in viewpoint along the same route, and eventually enables the ground mobile robot to localize itself in the global map given by the drone. It comprises dense mapping with Elevation Map and the software "Metashape"; map matching with a proposed template matching algorithm, weighted normalized cross-correlation (WNCC); and localization with a particle filter. The approach enables map matching for cooperative SLAM with a variety of scene sensors, ranging from stereo cameras to LiDARs, and is insensitive to the synchronization of the two systems. We demonstrate the accuracy, robustness, and speed of the approach in experiments on the Aero-Ground Dataset.
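To make the template matching step more concrete, the sketch below uses one common formulation of a weighted normalized cross-correlation: weighted means and variances replace the plain ones in ordinary NCC, and the template is slid over the larger map to find the best-scoring offset. This is an assumed, generic formulation; the paper's exact WNCC definition and its choice of weights are not given in the abstract, and the names `wncc`, `match_template`, and the toy grids are hypothetical.

```python
import math


def wncc(patch, template, weights):
    """Weighted NCC between two equally sized 2D lists (score in [-1, 1])."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    w = [v for row in weights for v in row]
    w_sum = sum(w)
    mean_a = sum(wi * ai for wi, ai in zip(w, a)) / w_sum
    mean_b = sum(wi * bi for wi, bi in zip(w, b)) / w_sum
    cov = sum(wi * (ai - mean_a) * (bi - mean_b) for wi, ai, bi in zip(w, a, b))
    var_a = sum(wi * (ai - mean_a) ** 2 for wi, ai in zip(w, a))
    var_b = sum(wi * (bi - mean_b) ** 2 for wi, bi in zip(w, b))
    denom = math.sqrt(var_a * var_b)
    # A flat patch has zero variance; treat it as uncorrelated.
    return cov / denom if denom > 0 else 0.0


def match_template(image, template, weights):
    """Slide the template over the image; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -2.0, (0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = wncc(patch, template, weights)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos


# Toy "global map" and a local template cut from position (1, 2).
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 1.0],
    [0.0, 0.0, 2.0, 9.0],
    [0.0, 0.0, 0.0, 0.0],
]
template = [[5.0, 1.0], [2.0, 9.0]]
weights = [[1.0, 1.0], [1.0, 0.5]]  # hypothetical per-cell confidence
score, position = match_template(image, template, weights)
```

In this toy case the template is recovered at offset (1, 2) with a score of approximately 1. In practice the weights could down-weight cells that are unreliable in one of the two maps, but that choice is an assumption here.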

RO · Sep 30, 2019
Multi-agent Collaboration for Feasible Collaborative Behavior Construction and Evaluation

Yunkai Wang, Shenhan Jia, Zexi Chen et al.

For the two-person zero-sum stochastic game with a central controller, this paper proposes a reinforcement-learning-based algorithm for searching and selecting the best collaborative behavior, addressing how the central controller should choose the best collaborative object and action. In existing multi-agent collaboration and confrontation reinforcement learning methods, traversing all actions in a given state leads to long computation times and unsafe policy exploration. This paper proposes to construct a feasible collaborative behavior set using action space discretization, models of both sides, model-based prediction, and parallel search. Then, we use deep Q-learning to train a scoring function that selects the optimal collaborative behavior from the feasible collaborative behavior set. This method enables efficient and accurate computation in an environment with strong confrontation, high dynamics and a large number of agents, which is verified on passing collaboration among RoboCup Small Size League robots.

RO · Sep 17, 2019
Champion Team Paper: Dynamic Passing-Shooting Algorithm Based on CUDA of The RoboCup SSL 2019 Champion

Zexi Chen, Haodong Zhang, Dashun Guo et al.

ZJUNlict became the Small Size League Champion of RoboCup 2019 with 6 victories and 1 tie in its 7 games. Its overwhelming ball-handling and passing ability allowed ZJUNlict to greatly threaten its opponents while its own goal was almost never threatened. This paper presents the core technology behind its ball-handling and robot movement, which consists of hardware optimization, a dynamic passing and shooting strategy, and multi-agent cooperation and formation. We first describe the mechanical optimization of the capacitor placement, the redesign of the dribbler's damping system, and the electrical optimization achieved by replacing the core chip. We then describe our passing point algorithm. The passing and shooting strategy can be separated into two parts: we search for the passing point on SBIP-DPPS and evaluate the point based on the ball model. The statements and conclusions should be supported by the performance and game logs from the RoboCup 2019 Small Size League.

RO · May 24, 2019
Mechatronic Design of a Dribbling System for RoboCup Small Size Robot

Zheyuan Huang, Yunkai Wang, Lingyun Chen et al.

RoboCup SSL is an excellent platform for researching artificial intelligence and robotics. The dribbling system is essential, as it is the main component for performing advanced soccer skills such as trapping and dribbling. In this paper, we design a new dribbling system for SSL robots, including the mechatronic design and control algorithms. For the mechatronic design, we analyse and present the 3-touch-point model with simulation in ADAMS. In the motor controller algorithm, we use reinforcement learning to control the torque output. Finally, we verify the results on the robot.

RO · May 22, 2019
ZJUNlict Extended Team Description Paper for RoboCup 2019

Zheyuan Huang, Lingyun Chen, Jiacheng Li et al.

Team ZJUNlict won the championship of the RoboCup 2018 Small Size League, and this paper thoroughly describes the work and effort that ZJUNlict contributed. There are three main optimizations for the mechanical part, which accounted for most of our goals: "Touching Point Optimization", "Damping System Optimization", and "Dribbler Optimization". For the electrical part, we realized "Direct Torque Control" and an "Efficient Radio Communication Protocol", which are credited with stabilizing the dribbler and providing more secure communication between the robots and the computer. Our software group contributed as much as our hardware group, with "Vision Lost Compensation", which predicts movement with a Kalman filter, and an "Interception Prediction Algorithm" to achieve some skills and improve our ball possession rate.
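The idea of compensating for lost vision frames with a Kalman filter can be sketched as follows. This is a minimal one-dimensional constant-velocity filter written for illustration only: the state layout, the noise values `q` and `r`, and the class name are assumptions, and the team's actual filter design is not described in the abstract. While vision is available the filter alternates predict and update steps; when a frame is lost it only predicts, extrapolating from the last velocity estimate.

```python
class ConstantVelocityKF:
    """1D Kalman filter with state [position, velocity] (illustrative only)."""

    def __init__(self, pos, vel=0.0, q=1e-3, r=1e-2):
        self.p, self.v = pos, vel
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r              # assumed process / measurement noise

    def predict(self, dt):
        # x <- F x and P <- F P F^T + q*I, with F = [[1, dt], [0, 1]].
        (a, b), (c, d) = self.P
        self.p += self.v * dt
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        # Position-only measurement: H = [1, 0].
        (a, b), (c, d) = self.P
        s = a + self.r            # innovation covariance
        k0, k1 = a / s, c / s     # Kalman gain
        y = z - self.p            # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


# Ball moving at 2 m/s along one axis, observed for 30 frames at 10 Hz.
kf = ConstantVelocityKF(pos=0.0)
dt, true_vel = 0.1, 2.0
for k in range(1, 31):
    kf.predict(dt)
    kf.update(true_vel * k * dt)

# Vision lost for 3 frames: predict only, no measurement update.
pos_at_loss, vel_at_loss = kf.p, kf.v
for _ in range(3):
    kf.predict(dt)
```

During the three lost frames the position estimate advances linearly with the last velocity estimate while the covariance grows, so the next real measurement is weighted more heavily when vision returns.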