CV | Aug 8, 2023

LATR: 3D Lane Detection from Monocular Images with Transformer

arXiv:2308.04583v2 | 72 citations | h-index: 33 | Has Code
AI Analysis

This paper addresses accurate lane detection, a fundamental challenge in autonomous driving, but it appears incremental because it builds on existing transformer and 3D detection methods.

The paper tackles 3D lane detection from monocular images for autonomous driving, where depth ambiguity causes misalignment between the constructed 3D surrogate features and the original image. It presents LATR, a model that uses 3D-aware front-view features and transformer-based cross-attention, and reports state-of-the-art results, including an 11.4 gain in F1 score on OpenLane.

3D lane detection from monocular images is a fundamental yet challenging task in autonomous driving. Recent advances primarily rely on structural 3D surrogates (e.g., bird's eye view) built from front-view image features and camera parameters. However, the depth ambiguity in monocular images inevitably causes misalignment between the constructed surrogate feature map and the original image, posing a great challenge for accurate lane detection. To address this issue, we present a novel LATR model, an end-to-end 3D lane detector that uses 3D-aware front-view features without transformed view representation. Specifically, LATR detects 3D lanes via cross-attention based on query and key-value pairs, constructed using our lane-aware query generator and dynamic 3D ground positional embedding. On the one hand, each query is generated based on 2D lane-aware features and adopts a hybrid embedding to enhance lane information. On the other hand, 3D space information is injected as positional embedding from an iteratively-updated 3D ground plane. LATR outperforms previous state-of-the-art methods on the synthetic Apollo benchmark and the realistic OpenLane and ONCE-3DLanes benchmarks by large margins (e.g., 11.4 gain in terms of F1 score on OpenLane). Code will be released at https://github.com/JMoonr/LATR.
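The cross-attention step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `tanh` projection used for the positional embedding, the random inputs, and the single-head layout are all assumptions made for illustration, and the iterative ground-plane update is omitted. It only shows the general idea of lane queries attending to front-view features whose keys carry a 3D ground-position embedding.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def ground_positional_embedding(ground_xyz, w_pos):
    """Map 3D ground-plane points (N, 3) to embeddings (N, d).

    Hypothetical stand-in for the paper's dynamic 3D ground
    positional embedding; a single linear map followed by tanh.
    """
    return np.tanh(ground_xyz @ w_pos)

def cross_attention(queries, features, pos_embed):
    """Single-head cross-attention.

    queries:   (Q, d) lane queries
    features:  (N, d) flattened front-view image features
    pos_embed: (N, d) positional embedding, one row per feature
    """
    d = queries.shape[-1]
    keys = features + pos_embed              # 3D position is added to the keys only
    scores = queries @ keys.T / np.sqrt(d)   # (Q, N) scaled dot-product scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ features, weights       # values are the raw features

# Toy inputs (random, for shape checking only).
rng = np.random.default_rng(0)
num_queries, num_pixels, dim = 4, 6, 8
queries = rng.normal(size=(num_queries, dim))
features = rng.normal(size=(num_pixels, dim))
ground_xyz = rng.normal(size=(num_pixels, 3))
w_pos = rng.normal(size=(3, dim))

pos = ground_positional_embedding(ground_xyz, w_pos)
out, attn = cross_attention(queries, features, pos)
print(out.shape, attn.shape)
```

In this sketch each query produces one updated feature vector; a real detector would stack several such layers with learned projections and decode lane points from the outputs.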

Code Implementations: 1 repo
