EgoLocate: Real-time Motion Capture, Localization, and Mapping with Sparse Body-mounted Sensors
This work addresses the challenge of accurate real-time localization for applications such as robotics and AR/VR, though its contribution is incremental in that it combines existing techniques.
The paper tackles the problem of simultaneous human motion capture and environment mapping by integrating inertial sensors with a monocular camera, resulting in improved localization compared to state-of-the-art methods in both fields.
Human and environment sensing are two important topics in Computer Vision and Graphics. Human motion is often captured by inertial sensors, while the environment is mostly reconstructed using cameras. We integrate the two techniques in EgoLocate, a system that simultaneously performs human motion capture (mocap), localization, and mapping in real time from sparse body-mounted sensors, including 6 inertial measurement units (IMUs) and a monocular phone camera. On one hand, inertial mocap suffers from large translation drift due to the lack of a global positioning signal. EgoLocate leverages image-based simultaneous localization and mapping (SLAM) techniques to locate the human in the reconstructed scene. On the other hand, SLAM often fails when visual features are poor. EgoLocate uses inertial mocap to provide a strong prior for the camera motion. Experiments show that localization, a key challenge for both fields, is largely improved by our technique, compared with the state of the art in the two fields. Our code is available for research at https://xinyu-yi.github.io/EgoLocate/.
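As a rough illustration of why inertial-only tracking drifts and why an absolute position signal helps, consider the toy sketch below. It is not EgoLocate's algorithm: the 1D setup, the function name `simulate`, all parameter values, and the constant-gain correction are assumptions chosen only to show the general idea. A velocity measurement with a small bias is integrated over time (standing in for inertial mocap translation), and a second estimate additionally receives occasional noisy absolute position fixes (standing in for camera-based localization).

```python
import random


def simulate(steps=200, dt=0.1, bias=0.05, fix_every=10, gain=0.5, seed=0):
    """Toy 1D walk (hypothetical setup, not the paper's method).

    Returns the final absolute position error of
    (inertial-only estimate, estimate corrected by periodic fixes).
    """
    rng = random.Random(seed)
    true_pos = 0.0
    dead_reckoned = 0.0  # integrates biased velocity only, so it drifts
    fused = 0.0          # same integration plus periodic absolute corrections

    for k in range(1, steps + 1):
        true_vel = 1.0
        true_pos += true_vel * dt

        # Assumed sensor model: constant bias plus small Gaussian noise.
        measured_vel = true_vel + bias + rng.gauss(0.0, 0.02)
        dead_reckoned += measured_vel * dt
        fused += measured_vel * dt

        # Every `fix_every` steps an absolute (noisy) position fix arrives;
        # pull the fused estimate toward it with a constant gain.
        if k % fix_every == 0:
            fix = true_pos + rng.gauss(0.0, 0.01)
            fused += gain * (fix - fused)

    return abs(dead_reckoned - true_pos), abs(fused - true_pos)


if __name__ == "__main__":
    drift_err, fused_err = simulate()
    print(f"inertial-only error: {drift_err:.3f}")
    print(f"corrected error:     {fused_err:.3f}")
```

In this sketch the inertial-only error grows roughly linearly with time because the bias is integrated at every step, while the corrected estimate stays bounded because each fix removes part of the accumulated error. A real system would replace the constant gain with a proper estimator and work with full 6-DoF poses.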