CV, RO | Mar 21, 2022

Self-Supervised Road Layout Parsing with Graph Auto-Encoding

arXiv:2203.11000v2 | 1 citation | h-index: 19
Originality: Incremental advance
AI Analysis

This paper addresses automated road topology parsing for autonomous driving systems. The advance is incremental, since it builds on existing auto-encoder and graph representation methods.

The paper tackles road layout understanding by predicting human-interpretable graphs from bird's-eye-view maps using a self-supervised image-graph-image auto-encoder, achieving comparable performance to a fully-supervised baseline on the Argoverse dataset.

Aiming for higher-level scene understanding, this work presents a neural network approach that takes a road-layout map in bird's-eye-view as input, and predicts a human-interpretable graph that represents the road's topological layout. Our approach elevates the understanding of road layouts from pixel level to the level of graphs. To achieve this goal, an image-graph-image auto-encoder is utilized. The network is designed to learn to regress the graph representation at its auto-encoder bottleneck. This learning is self-supervised by an image reconstruction loss, without needing any external manual annotations. We create a synthetic dataset containing common road layout patterns and use it for training of the auto-encoder in addition to the real-world Argoverse dataset. By using this additional synthetic dataset, which conceptually captures human knowledge of road layouts and makes this available to the network for training, we are able to stabilize and further improve the performance of topological road layout understanding on the real-world Argoverse dataset. The evaluation shows that our approach exhibits comparable performance to a strong fully-supervised baseline.
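The self-supervision idea in the abstract (a graph at the bottleneck, scored only by how well it reproduces the input image) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `render_graph` is a hand-written soft rasterizer standing in for the decoder, `reconstruction_loss` is a plain mean squared error, and the toy road is invented. The paper's actual encoder, decoder, and loss are not specified in this text.

```python
import math

def render_graph(nodes, edges, size=32, sigma=1.0):
    """Softly rasterize a graph into a size x size image (hypothetical decoder).

    nodes: list of (x, y) pixel coordinates.
    edges: list of (i, j) index pairs into nodes.
    Pixel intensity decays with squared distance to the nearest edge segment.
    """
    img = [[0.0] * size for _ in range(size)]
    for i, j in edges:
        (ax, ay), (bx, by) = nodes[i], nodes[j]
        dx, dy = bx - ax, by - ay
        seg_len2 = dx * dx + dy * dy
        for y in range(size):
            for x in range(size):
                # Project the pixel onto the segment and clamp to its endpoints.
                t = 0.0 if seg_len2 == 0 else ((x - ax) * dx + (y - ay) * dy) / seg_len2
                t = min(1.0, max(0.0, t))
                d2 = (x - (ax + t * dx)) ** 2 + (y - (ay + t * dy)) ** 2
                img[y][x] = max(img[y][x], math.exp(-d2 / (2 * sigma ** 2)))
    return img

def reconstruction_loss(target, nodes, edges):
    """Mean squared error between the target map and the rendered graph."""
    size = len(target)
    rendered = render_graph(nodes, edges, size=size)
    total = sum((target[y][x] - rendered[y][x]) ** 2
                for y in range(size) for x in range(size))
    return total / (size * size)

# Toy bird's-eye-view "road": one straight segment (invented example data).
edges = [(0, 1)]
true_nodes = [(4.0, 16.0), (28.0, 16.0)]
target = render_graph(true_nodes, edges)

# A graph that matches the road reconstructs the image better than one that does not.
bad_nodes = [(4.0, 4.0), (28.0, 28.0)]
loss_good = reconstruction_loss(target, true_nodes, edges)
loss_bad = reconstruction_loss(target, bad_nodes, edges)
```

In this sketch the loss needs no graph annotations: the only training signal is the image itself, so a network that predicts `nodes` and `edges` could in principle be trained by minimizing `reconstruction_loss`, provided the renderer is differentiable.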

Code Implementations: 1 repo