A Monocular SLAM-based Multi-User Positioning System with Image Occlusion in Augmented Reality
This paper addresses the problem of enabling collaborative AR experiences for multiple users; the contribution is incremental, since it builds on existing SLAM and deep learning methods.
The paper tackles the challenge of multi-user spatial localization and synchronization in augmented reality by proposing a system based on ORB-SLAM2 and Unity 3D, which uses monocular RGB images and virtual objects as reference points, achieving improved occlusion handling through deep learning-based depth estimation.
In recent years, with the rapid development of augmented reality (AR) technology, demand for multi-user collaborative experiences has grown. Unlike single-user experiences, multi-user experiences require localizing every user in space and keeping position and orientation synchronized and consistent across users, which is a significant challenge. In this paper, we propose a multi-user localization system based on ORB-SLAM2 that uses monocular RGB images, with the Unity 3D game engine as the development platform. The system not only localizes users but also places a common virtual object on a planar surface (such as a table) in the environment so that every user sees the object from the correct perspective. These virtual objects serve as reference points for multi-user position synchronization. Positioning information is passed among the users' AR devices via a central server; based on it, each user sees the relative positions and movements of the other users as virtual avatars, all expressed with respect to these virtual objects. In addition, we use deep learning to estimate a depth map from a single RGB image to address occlusion in AR applications, making virtual objects appear more natural in AR scenes.
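
The two mechanisms described above, synchronization through a shared virtual object and depth-based occlusion, can be illustrated with a minimal sketch. It is not the implementation used in the system. It assumes that each device reports its camera pose and the pose of the shared object as 4x4 rigid transforms in its own SLAM world frame, that all devices share the same metric scale (monocular SLAM does not guarantee this by itself), and that the estimated depth map is expressed in the same units as the rendered depth of the virtual object. All function names are hypothetical.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rigid_inverse(t):
    """Invert a rigid transform [R | p] as [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def pose(yaw_deg, x, y, z):
    """Hypothetical helper: pose with a rotation about the vertical (y) axis."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    return [[c, 0.0, s, x],
            [0.0, 1.0, 0.0, y],
            [-s, 0.0, c, z],
            [0.0, 0.0, 0.0, 1.0]]

def pose_relative_to_anchor(anchor_in_world, camera_in_world):
    """Express a user's camera pose in the frame of the shared virtual object.

    This is the quantity a device would send to the server, because it does
    not depend on that device's private SLAM world frame.
    """
    return mat_mul(rigid_inverse(anchor_in_world), camera_in_world)

def avatar_pose_in_local_world(local_anchor_in_world, remote_relative_pose):
    """Place a remote user's avatar in the local user's world frame."""
    return mat_mul(local_anchor_in_world, remote_relative_pose)

def visible_mask(real_depth, virtual_depth):
    """Per-pixel visibility of the virtual object.

    A virtual pixel is drawn only where the object covers the pixel
    (depth is not None) and lies closer to the camera than the estimated
    depth of the real scene (assumed to be in the same units).
    """
    return [[v is not None and v < r for r, v in zip(real_row, virt_row)]
            for real_row, virt_row in zip(real_depth, virtual_depth)]
```

In this sketch, user A would compute `pose_relative_to_anchor` from its own anchor and camera poses and send the result to the server; user B would pass the received transform to `avatar_pose_in_local_world` together with its own pose of the same virtual object. Only anchor-relative poses cross the network, so the devices never need to align their SLAM maps directly.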