Yijun Yuan

RO
h-index 9
16 papers
104 citations
Novelty 50%
AI Score 48

16 Papers

CV Jun 17, 2022 Code
An Algorithm for the SE(3)-Transformation on Neural Implicit Maps for Remapping Functions

Yijun Yuan, Andreas Nuechter

Implicit representations are widely used for object reconstruction due to their efficiency and flexibility. In 2021, a novel structure named the neural implicit map was introduced for incremental reconstruction. A neural implicit map alleviates the inefficient memory cost of previous online 3D dense reconstruction while producing better quality. However, the neural implicit map has the limitation that it does not support remapping, because the frames of scans are encoded into a deep prior after the neural implicit map is generated. This means that the generation process is not invertible and the deep prior is not transformable. This non-remappable property makes it impossible to apply loop-closure techniques. We present a neural implicit map based transformation algorithm to fill this gap. As our neural implicit map is transformable, our model supports remapping for this special map of latent features. Experiments show that our remapping module is capable of transforming neural implicit maps to new poses well. Embedded into a SLAM framework, our mapping model is able to tackle the remapping of loop closures and demonstrates high-quality surface reconstruction. Our implementation is available at https://github.com/Jarrome/IMT_Mapping for the research community.

CV Apr 29, 2023 Code
NSLF-OL: Online Learning of Neural Surface Light Fields alongside Real-time Incremental 3D Reconstruction

Yijun Yuan, Andreas Nüchter

Immersive novel view generation is an important technology in the field of graphics and has recently also received attention for operator-based human-robot interaction. However, the training involved is time-consuming, and thus current evaluations focus mainly on object capturing. This limits the usage of related models in the robotics community for 3D reconstruction, since robots (1) usually capture only a very small range of view directions to surfaces, which causes arbitrary predictions for unseen, novel directions, (2) require real-time algorithms, and (3) work with growing scenes, e.g., in robotic exploration. The paper proposes a novel Neural Surface Light Fields model that copes with the small range of view directions while producing good results in unseen directions. Exploiting recent encoding techniques, the training of our model is highly efficient. In addition, we design Multiple Asynchronous Neural Agents (MANA), a universal framework to learn each small region in parallel for large-scale growing scenes. Our model learns the Neural Surface Light Fields (NSLF) online alongside real-time 3D reconstruction, with a sequential data stream as the shared input. In addition to online training, our model also provides real-time rendering for visualization once the data stream is complete. We conduct experiments on well-known RGBD indoor datasets, showing the high flexibility of embedding our model into real-time 3D reconstruction and demonstrating high-fidelity view synthesis for these scenes. The code is available on GitHub.

CV Mar 22, 2023
Uni-Fusion: Universal Continuous Mapping

Yijun Yuan, Andreas Nuechter

We present Uni-Fusion, a universal continuous mapping framework for surfaces, surface properties (color, infrared, etc.), and more (latent features in CLIP embedding space, etc.). We propose the first universal implicit encoding model that supports encoding of both geometry and different types of properties (RGB, infrared, features, etc.) without requiring any training. Based on this, our framework divides the point cloud into regular grid voxels and generates a latent feature in each voxel to form a Latent Implicit Map (LIM) for geometries and arbitrary properties. Then, by fusing a local LIM into a global LIM frame by frame, incremental reconstruction is achieved. Encoded with the corresponding types of data, our Latent Implicit Map is capable of generating continuous surfaces, surface property fields, surface feature fields, and all other possible options. To demonstrate the capabilities of our model, we implement three applications: (1) incremental reconstruction for surfaces and color, (2) 2D-to-3D transfer of fabricated properties, and (3) open-vocabulary scene understanding by creating a text CLIP feature field on surfaces. We evaluate Uni-Fusion through comparisons in each of these applications, in which Uni-Fusion shows high flexibility while performing best or being competitive. The project page of Uni-Fusion is available at https://jarrome.github.io/Uni-Fusion/.
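
The voxel-and-fuse idea in this abstract can be illustrated with a minimal Python sketch. It only shows how a point cloud might be divided into regular grid voxels and how a per-frame local map might be merged into a global one. The names `voxelize`, `encode`, `build_local_map`, and `fuse` are hypothetical, and `encode` is a deliberately trivial stand-in (centroid plus point count) that is an assumption of this sketch, not the latent encoding used by Uni-Fusion.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Group 3D points by the integer index of the grid voxel that contains them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        voxels[key].append((x, y, z))
    return dict(voxels)


def encode(points):
    """Stand-in per-voxel 'feature': (centroid, count). NOT the paper's encoder."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return centroid, n


def build_local_map(points, voxel_size):
    """Encode one frame: one stand-in feature per occupied voxel."""
    return {key: encode(pts) for key, pts in voxelize(points, voxel_size).items()}


def fuse(global_map, local_map):
    """Merge a local map into the global map with a count-weighted average."""
    for key, (c_local, n_local) in local_map.items():
        if key in global_map:
            c_global, n_global = global_map[key]
            n = n_global + n_local
            merged = tuple(
                (c_global[i] * n_global + c_local[i] * n_local) / n for i in range(3)
            )
            global_map[key] = (merged, n)
        else:
            global_map[key] = (c_local, n_local)
    return global_map
```

Calling `fuse(global_map, build_local_map(frame_points, 1.0))` once per incoming frame grows the global map incrementally. The count-weighted average is just one simple fusion rule chosen for this illustration.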

58.1 CV Mar 28
Complet4R: Geometric Complete 4D Reconstruction

Weibang Wang, Kenan Li, Zhuoguang Chen et al.

We introduce Complet4R, a novel end-to-end framework for Geometric Complete 4D Reconstruction, which aims to recover temporally coherent and geometrically complete reconstructions of dynamic scenes. Our method formalizes the task of Geometric Complete 4D Reconstruction as a unified framework of reconstruction and completion by directly accumulating full contexts onto each frame. Unlike previous approaches that rely on pairwise reconstruction or local motion estimation, Complet4R utilizes a decoder-only transformer to operate on all context globally, directly from sequential video input, reconstructing a complete geometry for every single timestamp, including occluded regions visible in other frames. Our method demonstrates state-of-the-art performance on our proposed benchmark for Geometric Complete 4D Reconstruction and on the 3D Point Tracking task. Code will be released to support future research.

8.3 CR Apr 22
VRSafe: A Secure Virtual Keyboard to Mitigate Keystroke Inference in Virtual Reality

Yijun Yuan, Na Du, Adam J. Lee et al.

Password-based authentication is one of the most commonly used methods for verifying user identities, and its widespread usage continues in virtual reality (VR) applications. As a result, various forms of attacks on password-based authentication in traditional environments, such as keystroke inference and shoulder surfing, are still effective in VR applications. While keystroke inference attacks on virtual keyboards have been studied extensively, few efforts have developed an effective and cost-efficient defense strategy to mitigate keystroke inference in VR. To address this gap, this paper presents a novel QWERTY keyboard called VRSafe that is resilient to keystroke inference attacks. The proposed keyboard carefully introduces false positive keystrokes into the information collected by attackers during the typing process, making the inference of the original password difficult. VRSafe also incorporates a novel malicious login detector that can effectively identify unauthorized login attempts using credentials inferred from keystroke inference attacks, with a high detection rate and minimal time and memory cost. The proposed design is evaluated through both simulation experiments and a real-world user study, and the results show that VRSafe can significantly reduce the accuracy of keystroke inference attacks while incurring a modest overhead from a usability standpoint.

CV May 13, 2024
SceneFactory: A Workflow-centric and Unified Framework for Incremental Scene Modeling

Yijun Yuan, Michael Bleier, Andreas Nüchter

We present SceneFactory, a workflow-centric and unified framework for incremental scene modeling that conveniently supports a wide range of applications, such as (unposed and/or uncalibrated) multi-view depth estimation, LiDAR completion, (dense) RGB-D/RGB-L/Mono/Depth-only reconstruction and SLAM. The workflow-centric design uses multiple blocks as the basis for constructing different production lines. The supported applications, i.e., productions, avoid redundancy in their designs. Thus, the focus is placed on each block itself for independent expansion. To support all input combinations, our implementation consists of four building blocks that form SceneFactory: (1) tracking, (2) flexion, (3) depth estimation, and (4) scene reconstruction. The tracking block is based on Mono SLAM and is extended to support RGB-D and RGB-LiDAR (RGB-L) inputs. Flexion is used to convert the depth image (untrackable) into a trackable image. For general-purpose depth estimation, we propose an unposed & uncalibrated multi-view depth estimation model (U$^2$-MVD) to estimate dense geometry. U$^2$-MVD exploits dense bundle adjustment to solve for poses, intrinsics, and inverse depth. A semantic-aware ScaleCov step is then introduced to complete the multi-view depth. Relying on U$^2$-MVD, SceneFactory both supports user-friendly 3D creation (with just images) and bridges the applications of Dense RGB-D and Dense Mono. For high-quality surface and color reconstruction, we propose Dual-purpose Multi-resolutional Neural Points (DM-NPs) for the first surface-accessible Surface Color Field design, where we introduce Improved Point Rasterization (IPR) for point-cloud-based surface query. ...

CV Sep 21, 2025
SLAM-Former: Putting SLAM into One Transformer

Yijun Yuan, Zhuoguang Chen, Kenan Li et al.

We present SLAM-Former, a novel neural approach that integrates full SLAM capabilities into a single transformer. Similar to traditional SLAM systems, SLAM-Former comprises both a frontend and a backend that operate in tandem. The frontend processes sequential monocular images in real time for incremental mapping and tracking, while the backend performs global refinement to ensure a geometrically consistent result. This alternating execution allows the frontend and backend to mutually promote one another, enhancing overall system performance. Comprehensive experimental results demonstrate that SLAM-Former achieves superior or highly competitive performance compared to state-of-the-art dense SLAM methods.

RO Nov 17, 2020
Improved Visual-Inertial Localization for Low-cost Rescue Robots

Xiaoling Long, Qingwen Xu, Yijun Yuan et al.

This paper improves visual-inertial systems to boost the localization accuracy of low-cost rescue robots. When robots traverse rugged terrain, pose estimation suffers from strong noise in the inertial sensor measurements caused by ground contact forces, especially for low-cost sensors. Therefore, we propose Threshold-based and Dynamic Time Warping-based methods to detect abnormal measurements and mitigate such faults. The two methods are embedded into the popular VINS-Mono system to evaluate their performance. Experiments are performed on simulation and real robot data and show that both methods increase the pose estimation accuracy. Moreover, the Threshold-based method performs better when the noise is small, and the Dynamic Time Warping-based one shows greater potential under large noise.
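
As a rough illustration of the two ideas named in this abstract, the following Python sketch shows a simple threshold filter for a one-dimensional stream of inertial samples and the classic dynamic time warping (DTW) distance between two sequences. Both functions are generic textbook versions. How the paper sets its thresholds, which signals it compares with DTW, and how faults are mitigated inside VINS-Mono are not stated in the abstract, so the function names and the replace-with-last-good-value policy are assumptions.

```python
def clamp_spikes(samples, threshold):
    """Replace samples whose magnitude exceeds `threshold` with the last accepted value.

    Assumed policy for illustration only; the first sample falls back to 0.0
    if it is already above the threshold.
    """
    cleaned = []
    last_good = 0.0
    for s in samples:
        if abs(s) > threshold:
            cleaned.append(last_good)  # treat as abnormal, hold previous value
        else:
            cleaned.append(s)
            last_good = s
    return cleaned


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance with |x - y| cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

A window of measurements whose DTW distance to a reference window is unusually large could then be flagged as abnormal. That decision rule is a hypothetical usage, not one taken from the paper.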

RO Mar 11, 2020
Self-supervised Point Set Local Descriptors for Point Cloud Registration

Yijun Yuan, Jiawei Hou, Andreas Nüchter et al.

In this work, we propose to learn local descriptors for point clouds in a self-supervised manner. In each training iteration, the input of the network is merely one unlabeled point cloud. Building on our previous work, which directly solves the transformation between two point sets in one step without correspondences, the proposed method is able to train from a single point cloud by supervising a randomly generated self-rotation. The whole training requires no manual annotation. In several experiments, we evaluate the performance of our method on various datasets and compare it to other state-of-the-art algorithms. The results show that our self-supervised descriptor achieves equivalent or even better performance than the supervised model, while being easier to train and not requiring labeled data.

RO Mar 1, 2020
Non-iterative One-step Solution for Point Set Registration Problem on Pose Estimation without Correspondence

Yijun Yuan, Dorit Borrmann, Andreas Nüchter et al.

In this work, we propose to directly find the one-step solution for the point set registration problem without correspondences. Inspired by the Kernel Correlation method, we consider the fully connected objective function between two point sets, thus avoiding the computation of correspondences. By utilizing least-squares minimization, the transformed objective function is solved directly with well-known closed-form solutions, e.g., singular value decomposition, that are usually used when correspondences are given. However, using equal cost weights for every connection degrades the solution because of the large influence of distant pairs. Thus, we additionally scale each term to avoid high costs on unimportant pairs. As in feature-based registration methods, the similarity between point descriptors determines the scaling weight. Given the weights, we obtain a one-step solution. As the runtime is in $\mathcal{O}(n^2)$, we also propose a variant with keypoints that strongly reduces the cost. The experiments show that the proposed method gives a one-step solution without an initial guess. Our method exhibits competitive outlier robustness and accuracy compared to various other methods, and it is more stable in the case of large rotations. Additionally, our one-step solution achieves performance on par with the state-of-the-art feature-based method TEASER.
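
To make the weighted, fully connected least-squares idea concrete, here is a minimal Python sketch restricted to 2D, where the optimal rotation has a closed form via `atan2` and no SVD is needed. It minimizes the sum over all pairs (i, j) of w_ij * ||R p_i + t - q_j||^2. The 2D restriction, the scalar descriptors, the Gaussian `similarity_weights` function, and all names are assumptions made for illustration. The paper's actual weighting and its 3D SVD-based solution are not reproduced here.

```python
import math


def similarity_weights(desc_p, desc_q, sigma=0.1):
    """Hypothetical weights: Gaussian similarity between scalar point descriptors."""
    return [
        [math.exp(-((dp - dq) ** 2) / (sigma ** 2)) for dq in desc_q]
        for dp in desc_p
    ]


def one_step_registration_2d(P, Q, W):
    """Closed-form (theta, (tx, ty)) minimizing sum_ij W[i][j] * |R p_i + t - q_j|^2."""
    pairs = [(i, j) for i in range(len(P)) for j in range(len(Q))]  # O(n^2) pairs
    total = sum(W[i][j] for i, j in pairs)
    # Weighted centroids over all connections.
    px = sum(W[i][j] * P[i][0] for i, j in pairs) / total
    py = sum(W[i][j] * P[i][1] for i, j in pairs) / total
    qx = sum(W[i][j] * Q[j][0] for i, j in pairs) / total
    qy = sum(W[i][j] * Q[j][1] for i, j in pairs) / total
    # Accumulate the cosine and sine coefficients of the centered objective.
    s_cos = s_sin = 0.0
    for i, j in pairs:
        a, b = P[i][0] - px, P[i][1] - py
        c, d = Q[j][0] - qx, Q[j][1] - qy
        s_cos += W[i][j] * (a * c + b * d)
        s_sin += W[i][j] * (a * d - b * c)
    theta = math.atan2(s_sin, s_cos)
    # Translation maps the rotated source centroid onto the target centroid.
    tx = qx - (math.cos(theta) * px - math.sin(theta) * py)
    ty = qy - (math.sin(theta) * px + math.cos(theta) * py)
    return theta, (tx, ty)
```

With sharply peaked weights the result approaches the classical correspondence-based solution, while uniform weights let distant pairs dominate, which is the degeneration the abstract describes. The double loop over all pairs also makes the quadratic cost visible.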

RO Oct 1, 2019
Area Graph: Generation of Topological Maps using the Voronoi Diagram

Jiawei Hou, Yijun Yuan, Sören Schwertfeger

Representing a scanned map of the real environment as a topological structure is an important research topic in robotics. Since topological representations of maps save a huge amount of map storage space and online computing time, they are widely used in fields such as path planning, map matching, and semantic mapping. We use a topological map representation, the Area Graph, in which the vertices represent areas and the edges represent passages. The Area Graph is developed from a pruned Voronoi Graph, the Topology Graph. We also employ a simple room detection algorithm to compensate for the fact that the Voronoi Graph becomes unstable in open areas. We claim that our area segmentation method is superior to state-of-the-art approaches in complex indoor environments and support this claim with a number of experiments.

RO Sep 17, 2019
Configuration-Space Flipper Planning on 3D Terrain

Yijun Yuan, Qingwen Xu, Sören Schwertfeger

Flippers are essential components of tracked robot locomotion systems for unstructured terrain, especially within a rescue scenario. Achieving full and semi-autonomy for such rescue robots is the goal of many research efforts. In this work, we propose an algorithm to plan the morphologies of a small rescue robot with four flippers over 3D ground without any extra sensor, such as a pressure sensor. To achieve this goal, we simplify the rescue robot as a skeleton on inflated terrain. Its morphology can be represented by configurations of several parameters. Then we plan the mobile movement on 3D terrain with four individually manipulated flippers. We perform real robot experiments on three different obstacles. The results show that we move the flippers very effectively and are thus able to tackle those terrains very well.

RO May 8, 2019
Configuration-Space Flipper Planning for Rescue Robots

Yijun Yuan, Letong Wang, Sören Schwertfeger

For rescue robots, flippers endow the robot with the additional ability to pass through various terrain, and autonomous motion is becoming more important. In recent work, autonomy is achieved either by planning with several special states or based on collected data. We consider whether it is possible to build continuous states without collecting prior trial data. In this paper, we first model the possible states as a global planning path with a parameter configuration of the scene. Then, we follow the path to achieve the autonomous run. We plot the morphology at each path point to show the correctness of the path and implement simple path following on a real robot to demonstrate the performance of our algorithm.

RO Nov 26, 2018
Fast Gaussian Process Occupancy Maps

Yijun Yuan, Haofei Kuang, Sören Schwertfeger

In this paper, we present our work on Gaussian Process Occupancy Mapping (GPOM). We concentrate on the inefficiency of the per-frame computation in classical GPOM approaches. In robotics, most algorithms are required to run in real time. However, the high computational cost makes classical GPOM less useful. In this paper, we do not try to optimize the Gaussian Process itself; instead, we focus on the application. By analyzing the time cost of each step of the algorithm, we find a way to reduce the cost while maintaining good performance compared to the general GPOM framework. Our experiments show that our model enables GPOM to run online and achieves better quality than classical GPOM.

RO Nov 13, 2018
Topological Area Graph Generation and its Application to Path Planning

Jiawei Hou, Yijun Yuan, Sören Schwertfeger

Representing a scanned map of the real environment as a topological structure is an important research topic in robotics. Since topological representations of maps save a huge amount of map storage space and online computing time, they are widely used in fields such as path planning, map matching, and semantic mapping. We propose a novel topological map representation, the Area Graph, in which the vertices represent areas and the edges represent passages. The Area Graph is developed from a pruned Voronoi Graph, the Topology Graph. The paper also presents path planning as one application of the Area Graph. For that, we derive a so-called Passage Graph from the Area Graph. Because our algorithm segments the map into a set of areas, the first experiment compares the results of the Area Graph with those of state-of-the-art segmentation approaches and shows that our method effectively prevents over-segmentation. The second experiment then shows the superiority of our method over the traditional A* planning algorithm.
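
The vertex-and-edge structure described in this abstract can be pictured with a small Python sketch in which areas are vertices and each edge is labeled with the passage that connects two areas. The adjacency-dictionary layout, the area and passage names, and the plain breadth-first search in `plan_route` are all assumptions for illustration. In particular, this is not the Passage Graph derived in the paper, whose construction the abstract does not detail.

```python
from collections import deque

# Hypothetical Area Graph: area -> {neighboring area: passage connecting them}.
EXAMPLE_AREA_GRAPH = {
    "corridor": {"office": "door_1", "lab": "door_2"},
    "office": {"corridor": "door_1"},
    "lab": {"corridor": "door_2", "storage": "door_3"},
    "storage": {"lab": "door_3"},
}


def plan_route(area_graph, start, goal):
    """Breadth-first search over areas.

    Returns [area, passage, area, ..., area] from start to goal, or None
    if the goal cannot be reached.
    """
    parents = {start: None}
    queue = deque([start])
    while queue:
        area = queue.popleft()
        if area == goal:
            break
        for neighbor in area_graph.get(area, {}):
            if neighbor not in parents:
                parents[neighbor] = area
                queue.append(neighbor)
    if goal not in parents:
        return None
    # Walk back from the goal, interleaving the passages that were crossed.
    route = [goal]
    while parents[route[-1]] is not None:
        prev = parents[route[-1]]
        route.append(area_graph[prev][route[-1]])
        route.append(prev)
    route.reverse()
    return route
```

For example, `plan_route(EXAMPLE_AREA_GRAPH, "office", "storage")` walks through the corridor and the lab. Planning on a handful of areas instead of every grid cell is what keeps such a topological search small.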

RO Nov 5, 2018
Incrementally Building Topology Graphs via Distance Maps

Yijun Yuan, Sören Schwertfeger

Mapping is an essential task for mobile robots, and a topological representation often serves as the basis for various applications. In this paper, a novel framework that can build topological maps incrementally is proposed. The algorithm uses a distance map, and in our framework the topological map can grow as more sensor data is appended to the map. To demonstrate our algorithm, we show the results of the distance-map-based method on several popular maps and run the incremental framework with raw sensor data to obtain a growing topological map, as an example of a robot exploring the environment.
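
For readers unfamiliar with distance maps, the following Python sketch computes one on a small occupancy grid with a multi-source breadth-first search (often called a brushfire pass). Each cell receives its 4-connected step distance to the nearest obstacle. The grid encoding (`#` for obstacle, `.` for free space) and the function name are assumptions of this sketch. It stops at the distance map and does not attempt the incremental update or the topology extraction described in the paper.

```python
from collections import deque


def distance_map(grid):
    """Return a grid of 4-connected step distances to the nearest '#' cell.

    Cells stay None if the grid contains no obstacle at all.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every obstacle cell at distance 0.
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "#":
                dist[r][c] = 0
                queue.append((r, c))
    # Expand outward one step at a time; the first visit is the shortest distance.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist
```

On a walled 5x5 room the center cell gets the largest value, and such high-distance cells are the ones farthest from all obstacles.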