RO Mar 7, 2023
Deep Learning for Inertial Positioning: A Survey
Changhao Chen, Xianfei Pan
Inertial sensors are widely used in smartphones, drones, robots, and IoT devices, where they play a crucial role in enabling ubiquitous and reliable localization. Inertial sensor-based positioning is essential in various applications, including personal navigation, location-based security, and human-device interaction. However, measurements from low-cost MEMS inertial sensors are inevitably corrupted by various error sources, and these errors grow into unbounded position drift when doubly integrated in traditional inertial navigation algorithms. In recent years, with the rapid increase in sensor data and computational power, deep learning techniques have been developed, sparking significant research into addressing the problem of inertial positioning. Relevant literature in this field spans mobile computing, robotics, and machine learning. In this article, we provide a comprehensive review of deep learning-based inertial positioning and its applications in tracking pedestrians, drones, vehicles, and robots. We connect efforts from different fields and discuss how deep learning can be applied to address issues such as sensor calibration, positioning error drift reduction, and multi-sensor fusion. This article aims to attract readers from various backgrounds, including researchers and practitioners interested in the potential of deep learning-based techniques to solve inertial positioning problems. Our review demonstrates the exciting possibilities that deep learning brings to the table and provides a roadmap for future research in this field.
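The drift problem described in this abstract can be illustrated with a minimal sketch. It is not taken from the survey: the sampling rate, bias, and noise level below are assumed values chosen only for illustration, and the integration is a plain Euler scheme rather than a full strapdown algorithm. A stationary sensor with a small constant accelerometer bias is integrated twice, and the position error grows roughly as 0.5 * bias * t^2.

```python
import numpy as np

def double_integrate(accel, dt):
    """Integrate acceleration twice (simple Euler scheme) to obtain position."""
    velocity = np.cumsum(accel) * dt
    position = np.cumsum(velocity) * dt
    return position

# Assumed, illustrative values (not from the survey).
dt = 0.01          # sampling interval in seconds (100 Hz)
duration = 60.0    # total time in seconds
bias = 0.05        # constant accelerometer bias in m/s^2
noise_std = 0.02   # white measurement noise in m/s^2

rng = np.random.default_rng(0)
n = int(round(duration / dt))

# The sensor is stationary, so the true acceleration is zero and
# everything that is measured is error.
measured = bias + noise_std * rng.standard_normal(n)

position_error = double_integrate(measured, dt)
predicted = 0.5 * bias * duration**2  # quadratic growth caused by the bias alone

print(f"position error after {duration:.0f} s: {position_error[-1]:.2f} m")
print(f"bias-only prediction:            {predicted:.2f} m")
```

Doubling the elapsed time roughly quadruples the error, which is why uncorrected double integration of low-cost MEMS measurements becomes unusable within tens of seconds.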
CV Sep 18, 2022
EMA-VIO: Deep Visual-Inertial Odometry with External Memory Attention
Zheming Tu, Changhao Chen, Xianfei Pan et al.
Accurate and robust localization is a fundamental need for mobile agents. Visual-inertial odometry (VIO) algorithms exploit the information from camera and inertial sensors to estimate position and orientation. Recent deep learning-based VIO models have attracted attention because they provide pose information in a data-driven way, without the need to design hand-crafted algorithms. Existing learning-based VIO models rely on recurrent models to fuse multimodal data and process sensor signals, which are hard to train and not efficient enough. We propose a novel learning-based VIO framework with external memory attention that effectively and efficiently combines visual and inertial features for state estimation. Our proposed model is able to estimate pose accurately and robustly, even in challenging scenarios such as overcast days and water-filled ground, where traditional VIO algorithms struggle to extract visual features. Experiments validate that it outperforms both traditional and learning-based VIO baselines in different scenes.
CV Nov 16, 2022
SelfOdom: Self-supervised Egomotion and Depth Learning via Bi-directional Coarse-to-Fine Scale Recovery
Hao Qu, Lilian Zhang, Xiaoping Hu et al.
Accurately perceiving location and scene is crucial for autonomous driving and mobile robots. Recent advances in deep learning have made it possible to learn egomotion and depth from monocular images in a self-supervised manner, without requiring highly precise labels to train the networks. However, monocular vision methods suffer from a limitation known as scale ambiguity, which restricts their application when absolute scale is necessary. To address this, we propose SelfOdom, a self-supervised dual-network framework that can robustly and consistently learn and generate pose and depth estimates in global scale from monocular images. In particular, we introduce a novel coarse-to-fine training strategy that enables the metric scale to be recovered in a two-stage process. Furthermore, SelfOdom is flexible and can fuse inertial data with images through an attention-based fusion module, which improves its robustness in challenging scenarios. Our model excels in both normal and challenging lighting conditions, including difficult night scenes. Extensive experiments on public datasets demonstrate that SelfOdom outperforms representative traditional and learning-based VO and VIO models.
CV Jun 23, 2025
ThermalLoc: A Vision Transformer-Based Approach for Robust Thermal Camera Relocalization in Large-Scale Environments
Yu Liu, Yangtao Meng, Xianfei Pan et al.
Thermal cameras capture environmental data through heat emission, a fundamentally different mechanism compared to visible light cameras, which rely on pinhole imaging. As a result, traditional visual relocalization methods designed for visible light images are not directly applicable to thermal images. Despite significant advancements in deep learning for camera relocalization, approaches specifically tailored for thermal camera-based relocalization remain underexplored. To address this gap, we introduce ThermalLoc, a novel end-to-end deep learning method for thermal image relocalization. ThermalLoc effectively extracts both local and global features from thermal images by integrating EfficientNet with Transformers, and performs absolute pose regression using two MLP networks. We evaluated ThermalLoc on both the publicly available thermal-odometry dataset and our own dataset. The results demonstrate that ThermalLoc outperforms existing representative methods employed for thermal camera relocalization, including AtLoc, MapNet, PoseNet, and RobustLoc, achieving superior accuracy and robustness.
RO Sep 4, 2023
ReLoc-PDR: Visual Relocalization Enhanced Pedestrian Dead Reckoning via Graph Optimization
Zongyang Chen, Xianfei Pan, Changhao Chen
Accurately and reliably positioning pedestrians in satellite-denied conditions remains a significant challenge. Pedestrian dead reckoning (PDR) is commonly employed to estimate pedestrian location using low-cost inertial sensors. However, PDR is susceptible to drift due to sensor noise, incorrect step detection, and inaccurate stride length estimation. This work proposes ReLoc-PDR, a fusion framework combining PDR and visual relocalization using graph optimization. ReLoc-PDR leverages time-correlated visual observations and learned descriptors to achieve robust positioning in visually degraded environments. A graph optimization-based fusion mechanism with the Tukey kernel effectively corrects cumulative errors and mitigates the impact of abnormal visual observations. Real-world experiments demonstrate that ReLoc-PDR surpasses representative methods in accuracy and robustness, achieving accurate and robust pedestrian positioning using only a smartphone in challenging environments such as less-textured corridors and dark nighttime scenarios.
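To make the role of the Tukey kernel concrete, here is a minimal sketch of the standard Tukey biweight loss and its reweighting factor. It is not the authors' implementation: the function names are hypothetical, the residuals are made-up normalized values, and the threshold c = 4.685 is the commonly used tuning constant for unit-variance residuals, assumed here rather than taken from the paper.

```python
import numpy as np

def tukey_loss(residual, c=4.685):
    """Tukey biweight loss: grows for small residuals, saturates at c^2 / 6."""
    r = np.asarray(residual, dtype=float)
    inside = (c**2 / 6.0) * (1.0 - (1.0 - (r / c) ** 2) ** 3)
    return np.where(np.abs(r) <= c, inside, c**2 / 6.0)

def tukey_weight(residual, c=4.685):
    """Weight used in iteratively reweighted least squares; zero beyond c."""
    r = np.asarray(residual, dtype=float)
    w = (1.0 - (r / c) ** 2) ** 2
    return np.where(np.abs(r) <= c, w, 0.0)

# Hypothetical normalized residuals between predicted and observed positions.
# The last value mimics an abnormal visual relocalization result.
residuals = np.array([0.0, 1.0, 3.0, 10.0])
weights = tukey_weight(residuals)
losses = tukey_loss(residuals)

for r, w, l in zip(residuals, weights, losses):
    print(f"residual={r:5.1f}  weight={w:.3f}  loss={l:.3f}")
```

Because the weight falls to exactly zero beyond the threshold, a grossly wrong observation contributes no gradient to the optimization, whereas a quadratic loss would let it dominate the solution.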
RO Sep 7, 2015
Underwater Doppler Navigation with Self-calibration
Xianfei Pan, Yuanxin Wu
Precise autonomous navigation remains a substantial challenge for all underwater platforms. Inertial Measurement Units (IMUs) and Doppler Velocity Logs (DVLs) have complementary characteristics and are promising sensors that could enable fully autonomous underwater navigation in unexplored areas without relying on an external Global Positioning System (GPS) or acoustic beacons. This paper addresses the combined IMU/DVL navigation system from the viewpoint of observability. We show by analysis that under moderate conditions the combined system is observable. Specifically, the DVL parameters, including the scale factor and misalignment angles, can be calibrated in situ without using external GPS or acoustic beacon sensors. Simulation results using a practical estimator validate the analytic conclusions.
RO Jul 6, 2012
Velocity/Position Integration Formula (II): Application to Inertial Navigation Computation
Yuanxin Wu, Xianfei Pan
Inertial navigation applications are usually referenced to a rotating frame. Accounting for the rotation of the navigation reference frame in inertial navigation algorithm design is an important issue that has so far received little serious treatment, especially for ultra-high-speed aircraft or future ultra-precision navigation systems at the level of several meters per hour. This paper proposes a rigorous approach to tackle the issue of navigation frame rotation in velocity/position computation by use of the newly devised velocity/position integration formulae in the Part I companion paper. The two integration formulae provide a well-founded cornerstone for velocity/position algorithm design, making the principles of inertial navigation computation more accessible to practitioners; different approximations to the integrals involved give rise to various velocity/position update algorithms. Two-sample velocity and position algorithms are derived to exemplify the design process. In the context of level-flight airplane examples, the derived algorithm is analytically and numerically compared to the typical algorithms existing in the literature. The results shed light on the problems in existing algorithms and the potential benefits of the derived algorithm.
RO Jul 6, 2012
Velocity/Position Integration Formula (I): Application to In-flight Coarse Alignment
Yuanxin Wu, Xianfei Pan
In-flight alignment is a critical stage for airborne INS/GPS applications. The alignment task is usually carried out with Kalman filtering, which requires a good initial attitude to achieve satisfactory performance. Due to the airborne dynamics, in-flight alignment is much more difficult than alignment on the ground. This paper proposes an optimization-based coarse alignment approach using GPS position/velocity as input, founded on the newly derived velocity/position integration formulae. Simulation and flight test results show that, with the GPS lever arm well handled, it is potentially able to yield the initial heading to one-degree accuracy in ten seconds. It can serve as an effective coarse in-flight alignment, requiring no prior attitude information, for the subsequent fine Kalman alignment. The approach can also be applied to other applications that require aligning the INS on the run.