Shishir Gopinath

RO
3papers
3citations
Novelty52%
AI Score43

3 Papers

ROMar 16
TurboMap: GPU-Accelerated Local Mapping for Visual SLAM

Parsa Hosseininejad, Kimia Khabiri, Shishir Gopinath et al.

In real-time Visual SLAM systems, local mapping must operate under strict latency constraints, as delays degrade map quality and increase the risk of tracking failure. GPU parallelization offers a promising way to reduce latency. However, parallelizing local mapping is challenging due to synchronized shared-state updates and the overhead of transferring large map data structures to the GPU. This paper presents TurboMap, a GPU-parallelized and CPU-optimized local mapping backend that holistically addresses these challenges. We restructure Map Point Creation to enable parallel Keypoint Correspondence Search on the GPU, redesign and parallelize Map Point Fusion, optimize Redundant Keyframe Culling on the CPU, and integrate a fast GPU-based Local Bundle Adjustment solver. To minimize data transfer and synchronization costs, we introduce persistent GPU-resident keyframe storage. Experiments on the EuRoC and TUM-VI datasets show average local mapping speedups of 1.3x and 1.6x, respectively, while preserving accuracy.

ROMar 14
Graphite: A GPU-Accelerated Mixed-Precision Graph Optimization Framework

Shishir Gopinath, Karthik Dantu, Steven Y. Ko

We present Graphite, a GPU-accelerated nonlinear least squares graph optimization framework. It provides a CUDA C++ interface to enable the sharing of code between a real-time application, such as a SLAM system, and its optimization tasks. The framework supports techniques to reduce memory usage, including in-place optimization, support for multiple floating point types and mixed-precision modes, and dynamically computed Jacobians. We evaluate Graphite on well-known bundle adjustment problems and find that it achieves similar performance to MegBA, a solver specialized for bundle adjustment, while maintaining generality and using less memory. We also apply Graphite to global visual-inertial bundle adjustment on maps generated from stereo-inertial SLAM datasets, and observe speed-ups of up to 59x compared to a CPU baseline. Our results indicate that our framework enables faster large-scale optimization on both desktop and resource-constrained devices.

ROMar 17
FastLoop: Parallel Loop Closing with GPU-Acceleration in Visual SLAM

Soudabeh Mohammadhashemi, Shishir Gopinath, Kimia Khabiri et al.

Visual SLAM systems combine visual tracking with global loop closure to maintain a consistent map and accurate localization. Loop closure is a computationally expensive process as we need to search across the whole map for matches. This paper presents FastLoop, a GPU-accelerated loop closing module to alleviate this computational complexity. We identify key performance bottlenecks in the loop closing pipeline of visual SLAM and address them through parallel optimizations on the GPU. Specifically, we use task-level and data-level parallelism and integrate a GPU-accelerated pose graph optimization. Our implementation is built on top of ORB-SLAM3 and leverages CUDA for GPU programming. Experimental results show that FastLoop achieves an average speedup of 1.4x and 1.3x on the EuRoC dataset and 3.0x and 2.4x on the TUM-VI dataset for the loop closing module on desktop and embedded platforms, respectively, while maintaining the accuracy of the original system.