CVOct 5, 2025Code
TOPO-Bench: An Open-Source Topological Mapping Evaluation Framework with Quantifiable Perceptual AliasingJiaming Wang, Diwen Liu, Jizhuo Chen et al.
Topological mapping offers a compact and robust representation for navigation, but progress in the field is hindered by the lack of standardized evaluation metrics, datasets, and protocols. Existing systems are assessed using different environments and criteria, preventing fair and reproducible comparisons. Moreover, a key challenge - perceptual aliasing - remains under-quantified, despite its strong influence on system performance. We address these gaps by (1) formalizing topological consistency as the fundamental property of topological maps and showing that localization accuracy provides an efficient and interpretable surrogate metric, and (2) proposing the first quantitative measure of dataset ambiguity to enable fair comparisons across environments. To support this protocol, we curate a diverse benchmark dataset with calibrated ambiguity levels, implement and release deep-learned baseline systems, and evaluate them alongside classical methods. Our experiments and analysis yield new insights into the limitations of current approaches under perceptual aliasing. All datasets, baselines, and evaluation tools are fully open-sourced to foster consistent and reproducible research in topological mapping.
ROMay 4
Change-Robust Online Spatial-Semantic Topological MappingJiaming Wang, Jizhuo Chen, Diwen Liu et al.
Autonomous robots require change-robust spatial-semantic reasoning: using spatial and semantic knowledge to decide where to go, how to get there, and where the robot is despite environmental change. Existing approaches typically attach semantics to SLAM-built metric maps, but these pipelines are brittle under appearance shifts and scene dynamics, where data association and relocalization degrade. We propose a Change-Robust Online Spatial-Semantic (CROSS) representation that replaces a globally consistent metric substrate with an online, pose-aware topological graph of RGB-D keyframes. The system explicitly reasons over perceptual ambiguity using sequential hypothesis testing in continuous SE(3). Our estimator maintains a bounded Gaussian-mixture belief over poses, enabling principled handling of loop closures and kidnapped-robot events. Experiments under severe appearance change, including real-robot object-goal navigation with lighting shifts and furniture rearrangement, demonstrate improved robustness over SLAM-based and topological baselines while remaining safe under perceptual aliasing.
CLDec 12, 2024
Dipper: Diversity in Prompts for Producing Large Language Model Ensembles in Reasoning tasksGregory Kang Ruey Lau, Wenyang Hu, Diwen Liu et al.
Large Language Models (LLMs), particularly smaller variants, still struggle with complex reasoning tasks. While inference-time prompting can guide reasoning, existing methods often rely on sequential queries. Ensemble approaches offer a promising path to performance gains, especially given recent batch inference speed-ups. This work introduces DIPPER, a novel, training-free framework that transforms a single LLM into an effective inference-time ensemble. By feeding the model an optimized and diverse set of prompts in parallel, DIPPER elicits varied reasoning paths, leading to performance gains. We empirically demonstrate significant improvements on reasoning benchmarks, such as MATH, where a DIPPER ensemble of three Qwen2-MATH-1.5B instances (via parallel prompting of a single model) outperforms a larger 7B model.
ROJun 22, 2025
GeNIE: A Generalizable Navigation System for In-the-Wild EnvironmentsJiaming Wang, Diwen Liu, Jizhuo Chen et al.
Reliable navigation in unstructured, real-world environments remains a significant challenge for embodied agents, especially when operating across diverse terrains, weather conditions, and sensor configurations. In this paper, we introduce GeNIE (Generalizable Navigation System for In-the-Wild Environments), a robust navigation framework designed for global deployment. GeNIE integrates a generalizable traversability prediction model built on SAM2 with a novel path fusion strategy that enhances planning stability in noisy and ambiguous settings. We deployed GeNIE in the Earth Rover Challenge (ERC) at ICRA 2025, where it was evaluated across six countries spanning three continents. GeNIE took first place and achieved 79% of the maximum possible score, outperforming the second-best team by 17%, and completed the entire competition without a single human intervention. These results set a new benchmark for robust, generalizable outdoor robot navigation. We will release the codebase, pretrained model weights, and newly curated datasets to support future research in real-world navigation.
ROSep 15, 2025
GBPP: Grasp-Aware Base Placement Prediction for Robots via Two-Stage LearningJizhuo Chen, Diwen Liu, Jiaming Wang et al.
GBPP is a fast learning based scorer that selects a robot base pose for grasping from a single RGB-D snapshot. The method uses a two stage curriculum: (1) a simple distance-visibility rule auto-labels a large dataset at low cost; and (2) a smaller set of high fidelity simulation trials refines the model to match true grasp outcomes. A PointNet++ style point cloud encoder with an MLP scores dense grids of candidate poses, enabling rapid online selection without full task-and-motion optimization. In simulation and on a real mobile manipulator, GBPP outperforms proximity and geometry only baselines, choosing safer and more reachable stances and degrading gracefully when wrong. The results offer a practical recipe for data efficient, geometry aware base placement: use inexpensive heuristics for coverage, then calibrate with targeted simulation.