CV | Sep 11, 2018

Long-Term Occupancy Grid Prediction Using Recurrent Neural Networks

arXiv:1809.03782v2 | 79 citations
Originality: Incremental advance
AI Analysis

This paper addresses long-term scene prediction for automated driving systems and represents an incremental improvement over existing approaches.

The paper tackles long-term prediction of scene evolution in complex downtown driving scenarios. It feeds fused Lidar grids to recurrent neural networks that predict future occupancy grids, and it reports better prediction of occluded objects and multimodal future paths than previous methods.

We tackle the long-term prediction of scene evolution in a complex downtown scenario for automated driving based on Lidar grid fusion and recurrent neural networks (RNNs). A bird's eye view of the scene, including occupancy and velocity, is fed as a sequence to an RNN which is trained to predict future occupancy. The nature of prediction allows generation of multiple hours of training data without the need for manual labeling. Thus, the training strategy and loss function are designed for long sequences of real-world data (unbalanced, continuously changing situations, false labels, etc.). The deep CNN architecture comprises convolutional long short-term memories (ConvLSTMs) to separate static from dynamic regions and to predict dynamic objects in future frames. Novel recurrent skip connections show the ability to predict small occluded objects, i.e. pedestrians, and occluded static regions. Spatio-temporal correlations between grid cells are exploited to predict multimodal future paths and interactions between objects. Experiments also quantify improvements over our previous network, a Monte Carlo approach, and the literature.
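
The label-free training setup described in the abstract can be illustrated with a small sketch. It shows how a recorded sequence of grid frames might be sliced into (input, target) windows, where the targets are simply frames recorded later, and how a positively weighted binary cross-entropy could counter the imbalance between free and occupied cells. The function names `make_training_pairs` and `weighted_bce`, the window lengths, and the choice of a weighted cross-entropy are assumptions made for illustration; the abstract does not specify the authors' actual loss or data pipeline.

```python
import math
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")  # stand-in for one bird's-eye-view grid frame


def make_training_pairs(
    frames: Sequence[Frame], n_input: int, n_future: int, stride: int = 1
) -> List[Tuple[List[Frame], List[Frame]]]:
    """Slice a recorded sequence into (input, target) windows.

    Targets are the frames that follow each input window, so no manual
    labeling is needed. (Hypothetical helper, not from the paper.)
    """
    if n_input < 1 or n_future < 1 or stride < 1:
        raise ValueError("n_input, n_future and stride must be positive")
    window = n_input + n_future
    pairs = []
    for start in range(0, len(frames) - window + 1, stride):
        inputs = list(frames[start:start + n_input])
        targets = list(frames[start + n_input:start + window])
        pairs.append((inputs, targets))
    return pairs


def weighted_bce(
    pred: Sequence[float], target: Sequence[int], pos_weight: float
) -> float:
    """Mean binary cross-entropy with extra weight on occupied cells.

    Assumption: a generic remedy for class imbalance, used here only as a
    stand-in for the loss the paper designs.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equally long")
    eps = 1e-7  # keeps log() away from zero
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= pos_weight * t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


if __name__ == "__main__":
    # Integers stand in for grid frames; real frames would be 2D arrays
    # holding occupancy and velocity channels.
    recorded = list(range(10))
    pairs = make_training_pairs(recorded, n_input=3, n_future=2)
    print(len(pairs), pairs[0])
    # Flattened occupancy probabilities for four cells, one of them occupied.
    print(weighted_bce([0.9, 0.1, 0.2, 0.1], [1, 0, 0, 0], pos_weight=3.0))
```

In a real pipeline the target windows would keep only the occupancy channel, since the network is trained to predict future occupancy, while the input windows would carry both occupancy and velocity.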

