32.9 | RO | May 17
ORION: Option-Regularized Deep Reinforcement Learning for Cooperative Multi-Agent Online Navigation
Shizhe Zhang, Jingsong Liang, Zhitao Zhou et al.
Existing methods for multi-agent navigation typically assume fully known environments, offering limited support for partially known scenarios with outdated or imperfect prior maps, such as warehouses or factory floors. There, agents need to balance path optimality with collecting and sharing environmental information to help teammates reach their own targets. To this end, we propose ORION, a novel deep reinforcement learning framework for cooperative multi-agent online navigation in partially known environments. Starting from an imperfect prior map, ORION trains agents to make decentralized decisions, coordinate toward individual targets, and actively reduce task-relevant map uncertainty through online observation sharing in a closed perception-action loop. We first design a shared graph encoder that fuses the prior map with online perception into a unified representation, providing robust state embeddings under environmental discrepancies. At the core of ORION is an option-critic framework that learns high-level cooperative modes, which are translated into sequences of low-level actions, enabling adaptive switching between individual navigation and team-level exploration. We further introduce a dual-stage cooperation strategy that allows agents to assist teammates under map uncertainty, thereby reducing the overall makespan. Across extensive maze-like maps and large-scale warehouse environments, ORION achieves high-quality real-time decentralized cooperation while scaling to teams of up to 10 robots, outperforming state-of-the-art classical and learning-based baselines. Finally, we validate ORION on physical robot teams, demonstrating its robustness and practicality for real-world cooperative navigation.
89.3 | RO | Mar 14
ImagiNav: Scalable Embodied Navigation via Generative Visual Prediction and Inverse Dynamics
Jie Chen, Yuxin Cai, Yizhuo Wang et al.
Enabling robots to navigate open-world environments via natural language is critical for general-purpose autonomy. Yet, Vision-Language Navigation has relied on end-to-end policies trained on expensive, embodiment-specific robot data. While recent foundation models trained on vast simulation data show promise, scaling and generalization remain challenging because of the limited scene diversity and visual fidelity of simulation. To address this gap, we propose ImagiNav, a novel modular paradigm that decouples visual planning from robot actuation, enabling the direct utilization of diverse in-the-wild navigation videos. Our framework operates as a hierarchy: a Vision-Language Model first decomposes instructions into textual subgoals; a finetuned generative video model then imagines the future video trajectory towards each subgoal; finally, an inverse dynamics model extracts the trajectory from the imagined video, which can then be tracked via a low-level controller. We additionally develop a scalable data pipeline of in-the-wild navigation videos auto-labeled via inverse dynamics and a pretrained Vision-Language Model. ImagiNav demonstrates strong zero-shot transfer to robot navigation without requiring robot demonstrations, paving the way for generalist robots that learn navigation directly from unlabeled, open-world data.
LG | Aug 20, 2021 | Code
DL-Traff: Survey and Benchmark of Deep Learning Models for Urban Traffic Prediction
Renhe Jiang, Du Yin, Zhaonan Wang et al.
With the rapid development of IoT (Internet of Things) and CPS (Cyber-Physical Systems) technologies, big spatiotemporal data are being generated from mobile phones, car navigation systems, and traffic sensors. By leveraging state-of-the-art deep learning technologies on such data, urban traffic prediction has drawn a lot of attention in the AI and Intelligent Transportation System communities. The problem can be uniformly modeled with a 3D tensor (T, N, C), where T denotes the total number of time steps, N denotes the size of the spatial domain (i.e., mesh-grids or graph-nodes), and C denotes the number of information channels. According to the specific modeling strategy, the state-of-the-art deep learning models can be divided into three categories: grid-based, graph-based, and multivariate time-series models. In this study, we first synthetically review the deep traffic models as well as the widely used datasets, then build a standard benchmark to comprehensively evaluate their performance with the same settings and metrics. Our study, named DL-Traff, is implemented with the two most popular deep learning frameworks, i.e., TensorFlow and PyTorch, and is publicly available as two GitHub repositories: https://github.com/deepkashiwa20/DL-Traff-Grid and https://github.com/deepkashiwa20/DL-Traff-Graph. With DL-Traff, we hope to deliver a useful resource to researchers who are interested in spatiotemporal data analysis.
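The (T, N, C) layout described above can be illustrated with a minimal NumPy sketch. It is not taken from the DL-Traff repositories: the helper name `make_windows`, the window lengths, and the grid size are all hypothetical, chosen only to show how a mesh-grid flattens into N spatial units and how a series is sliced into input/target windows.

```python
import numpy as np

def make_windows(data, n_in, n_out):
    """Slice a (T, N, C) tensor into (input, target) window pairs.

    Hypothetical helper for illustration; returns
    x of shape (S, n_in, N, C) and y of shape (S, n_out, N, C),
    where S = T - n_in - n_out + 1.
    """
    total_steps = data.shape[0]
    n_samples = total_steps - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested window sizes")
    x = np.stack([data[i:i + n_in] for i in range(n_samples)])
    y = np.stack([data[i + n_in:i + n_in + n_out] for i in range(n_samples)])
    return x, y

# Assumed toy sizes: 48 time steps, a 4x5 mesh-grid, 2 channels.
T, H, W, C = 48, 4, 5, 2
grid = np.arange(T * H * W * C, dtype=np.float32).reshape(T, H, W, C)

# Flatten the grid cells so the spatial domain has N = H * W units.
data = grid.reshape(T, H * W, C)

x, y = make_windows(data, n_in=6, n_out=1)
print(x.shape, y.shape)
```

A graph-based model would use the same (T, N, C) tensor with N as the number of graph nodes, plus an adjacency matrix that this sketch does not cover.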
71.5 | IT | Apr 23
Robust Beamforming for MIMO Radar with Imperfect Prior Distribution Information
Yizhuo Wang, Shuowen Zhang
This paper studies a multiple-input multiple-output (MIMO) radar system for sensing the unknown and random angular location (angle) of a point target, based on the target-reflected echo signals and known prior distribution information about the target's angle specified by a probability density function (PDF). We consider a challenging yet practical scenario where the knowledge of such a PDF is imperfect, due to inaccuracy in PDF acquisition or unpredicted changes in the target appearance pattern. The real (actual) PDF is modeled as an unknown perturbed version of the imperfect known PDF, bounded by a given uncertainty radius. Such PDF imperfection motivates us to study the robust transmit beamforming design to optimize the worst-case sensing performance among all possible real PDFs. Since the sensing mean-squared error (MSE) is difficult to characterize explicitly, we adopt the worst-case posterior Cramér-Rao bound (PCRB) as the performance metric. We formulate the beamforming optimization problem to minimize the maximum PCRB among all possible real PDFs, which is highly non-trivial since the PCRB has a complex, intractable expression in terms of the real PDF, and there are infinitely many constraints corresponding to the continuous set of real PDFs bounded by the uncertainty radius. To address these challenges, we derive a tractable quadratic approximation of the PCRB via second-order Taylor expansion, and leverage the S-procedure to equivalently transform the infinitely many constraints into a linear matrix inequality, based on which the problem is reformulated into a convex optimization problem solvable with polynomial time complexity. The obtained solution approaches the globally optimal robust beamforming solution as the uncertainty radius decreases. Numerical results validate the effectiveness of our proposed robust beamforming design.
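The worst-case design described above can be written schematically as a min-max problem. This is only a sketch under assumed notation, not the paper's own formulation: $\mathbf{W}$ (the transmit beamforming variable), $\mathcal{W}$ (its feasible set, e.g. a power constraint), $\hat{p}$ (the imperfect known PDF), $\epsilon$ (the uncertainty radius), and the unspecified norm $\|\cdot\|$ are all introduced here for illustration.

```latex
% Robust design: minimize the worst-case PCRB over all admissible real PDFs.
\min_{\mathbf{W} \in \mathcal{W}} \;
\max_{p \in \mathcal{P}_\epsilon} \;
\mathrm{PCRB}(\mathbf{W}; p),
\qquad
\mathcal{P}_\epsilon = \bigl\{\, p : \| p - \hat{p} \| \le \epsilon \,\bigr\}.

% Equivalent epigraph form: one constraint per PDF in the continuous set,
% i.e. infinitely many constraints.
\min_{\mathbf{W} \in \mathcal{W},\; t} \; t
\quad \text{s.t.} \quad
\mathrm{PCRB}(\mathbf{W}; p) \le t
\quad \forall\, p \in \mathcal{P}_\epsilon.
```

The epigraph form makes the difficulty visible: the constraint must hold for every PDF in the continuous set $\mathcal{P}_\epsilon$, and it is this semi-infinite family that the abstract says is replaced by a single linear matrix inequality after the quadratic approximation and the S-procedure.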