Zezhou Sun

CV
h-index: 12
Papers: 4
Citations: 288
Novelty: 29%
AI Score: 29

4 Papers

CV · Mar 18, 2025 · Code
Can Large Vision Language Models Read Maps Like a Human?

Shuo Xing, Zezhou Sun, Shuangyu Xie et al.

In this paper, we introduce MapBench, the first dataset specifically designed for human-readable, pixel-based, map-based outdoor navigation, curated from complex path-finding scenarios. MapBench comprises over 1600 pixel-space map path-finding problems from 100 diverse maps. In MapBench, LVLMs generate language-based navigation instructions given a map image and a query with beginning and end landmarks. For each map, MapBench provides a Map Space Scene Graph (MSSG) as an indexing data structure to convert between natural language and evaluate LVLM-generated results. We demonstrate that MapBench significantly challenges state-of-the-art LVLMs under both zero-shot prompting and a Chain-of-Thought (CoT) augmented reasoning framework that decomposes map navigation into sequential cognitive processes. Our evaluation of both open-source and closed-source LVLMs underscores the substantial difficulty posed by MapBench, revealing critical limitations in their spatial reasoning and structured decision-making capabilities. We release all the code and the dataset at https://github.com/taco-group/MapBench.

CV · Sep 7, 2020
Frontier Detection and Reachability Analysis for Efficient 2D Graph-SLAM Based Active Exploration

Zezhou Sun, Banghe Wu, Cheng-Zhong Xu et al.

We propose an integrated approach to active exploration that uses the Cartographer method as the base SLAM module for submap creation and performs efficient frontier detection in the geometrically co-aligned submaps induced by graph optimization. We also analyze the reachability of frontiers and their clusters to ensure that each detected frontier can be reached by the robot. Our method is tested on a mobile robot in a real indoor scene to demonstrate the effectiveness and efficiency of our approach.
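
To make the two ideas in this abstract concrete, here is a minimal sketch of frontier detection and a reachability check on a single 2D occupancy grid. It is an illustration under stated assumptions, not the paper's implementation: the grid encoding (0 free, 1 occupied, -1 unknown), the 4-connected neighborhood, and the function names `find_frontiers` and `reachable_frontiers` are all hypothetical, and it works on one plain grid rather than on Cartographer submaps.

```python
from collections import deque

# Assumed cell encoding for this sketch (not taken from the paper).
FREE, OCCUPIED, UNKNOWN = 0, 1, -1


def neighbors(r, c, rows, cols):
    """Yield the 4-connected neighbors of (r, c) that lie inside the grid."""
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def find_frontiers(grid):
    """Return the set of free cells that touch at least one unknown cell."""
    rows, cols = len(grid), len(grid[0])
    return {
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if grid[r][c] == FREE
        and any(grid[nr][nc] == UNKNOWN for nr, nc in neighbors(r, c, rows, cols))
    }


def reachable_frontiers(grid, start):
    """Keep only frontiers connected to `start` through free cells (BFS).

    `start` is assumed to be a free cell.
    """
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in neighbors(r, c, rows, cols):
            if (nr, nc) not in seen and grid[nr][nc] == FREE:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return find_frontiers(grid) & seen


grid = [
    [0, 0, 1, 0, -1],
    [0, 0, 1, 0, -1],
    [0, -1, 1, 1, 1],
]
all_frontiers = find_frontiers(grid)
reachable = reachable_frontiers(grid, (0, 0))
```

In this toy grid the wall of occupied cells in column 2 separates the robot from the frontiers on the right, so only the frontiers on the left survive the reachability filter.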

RO · Dec 9, 2019
LiDAR Iris for Loop-Closure Detection

Ying Wang, Zezhou Sun, Cheng-Zhong Xu et al.

In this paper, a global descriptor for a LiDAR point cloud, called LiDAR Iris, is proposed for fast and accurate loop-closure detection. A binary signature image can be obtained for each point cloud after several LoG-Gabor filtering and thresholding operations on the LiDAR-Iris image representation. Given two point clouds, their similarity can be calculated as the Hamming distance between the two corresponding binary signature images. Our LiDAR-Iris method can achieve pose-invariant loop-closure detection at the descriptor level with the Fourier transform of the LiDAR-Iris representation, assuming a 3D (x,y,yaw) pose space, although our method can generally be applied to a 6D pose space by re-aligning point clouds with an additional IMU sensor. Experimental results on five road-scene sequences demonstrate its excellent performance in loop-closure detection.
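
The matching step described in this abstract can be sketched as a normalized Hamming distance between two binary images. The code below is a hedged illustration only: it assumes the binary signature images already exist as equal-sized lists of 0/1 rows, and it handles a yaw offset with a brute-force circular column shift, which is a simple stand-in and not the Fourier-transform alignment the abstract names. The function names are hypothetical.

```python
def hamming_distance(a, b):
    """Fraction of differing bits between two equal-sized binary images."""
    total = sum(len(row) for row in a)
    differing = sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    return differing / total


def shift_columns(img, k):
    """Circularly rotate every row of `img` left by k columns."""
    k %= len(img[0])
    return [row[k:] + row[:k] for row in img]


def min_shift_distance(a, b):
    """Try every column shift of `b`; return (best distance, best shift).

    Brute-force substitute for a proper rotation-invariant alignment.
    """
    width = len(a[0])
    return min((hamming_distance(a, shift_columns(b, k)), k) for k in range(width))


sig_a = [[1, 0, 0, 1], [0, 1, 1, 0]]
sig_b = shift_columns(sig_a, 1)  # same scene, simulated yaw offset
raw = hamming_distance(sig_a, sig_b)
best, best_shift = min_shift_distance(sig_a, sig_b)
```

A small distance after alignment suggests the two scans come from the same place; the threshold for declaring a loop closure is application-specific and is not specified here.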

RO · Jun 20, 2019
PuzzleFlex: kinematic motion of chains with loose joints

Samuel Lensgraf, Karim Itani, Yinan Zhang et al.

This paper presents a method of computing free motions of a planar assembly of rigid bodies connected by loose joints. Joints are modeled using local distance constraints, which are then linearized with respect to configuration space velocities, yielding a linear programming formulation that allows analysis of systems with thousands of rigid bodies. Potential applications include analysis of collections of modular robots, structural stability perturbation analysis, tolerance analysis for mechanical systems, and formation control of mobile robots.
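
As a rough illustration of how a local distance constraint becomes linear in the velocities, consider one loose joint between planar bodies A and B. The notation here is assumed for this sketch and is not taken from the paper: attachment points $p_A$, $p_B$, body reference points $c_A$, $c_B$, body velocities $(v_A, \omega_A)$ and $(v_B, \omega_B)$, joint slack $\varepsilon$, and a small time step $\Delta t$. The sketch also assumes the current separation $d$ is strictly positive.

```latex
% Distance between the two attachment points of a loose joint
d = \lVert p_A - p_B \rVert \le \varepsilon

% Planar rigid-body point velocity, with J the 90-degree rotation matrix
\dot{p}_A = v_A + \omega_A J (p_A - c_A), \qquad
\dot{p}_B = v_B + \omega_B J (p_B - c_B), \qquad
J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}

% Time derivative of the distance, with unit direction n = (p_A - p_B) / d
\dot{d} = n^{\top} (\dot{p}_A - \dot{p}_B)

% First-order condition: linear in (v_A, \omega_A, v_B, \omega_B)
d + \Delta t \, n^{\top} (\dot{p}_A - \dot{p}_B) \le \varepsilon
```

Stacking one such inequality per joint gives a set of linear constraints on the configuration-space velocities, which is the kind of system a linear program can handle at the scale of thousands of bodies.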