CVSep 19, 2024
LMT-Net: Lane Model Transformer Network for Automated HD Mapping from Sparse Vehicle ObservationsMichael Mink, Thomas Monninger, Steffen Staab
In autonomous driving, High Definition (HD) maps provide a complete lane model that is not limited by sensor range and occlusions. However, the generation and upkeep of HD maps involves periodic data collection and human annotations, limiting scalability. To address this, we investigate automating the lane model generation and the use of sparse vehicle observations instead of dense sensor measurements. For our approach, a pre-processing step generates polylines by aligning and aggregating observed lane boundaries. Aligned driven traces are used as starting points for predicting lane pairs defined by the left and right boundary points. We propose Lane Model Transformer Network (LMT-Net), an encoder-decoder neural network architecture that performs polyline encoding and predicts lane pairs and their connectivity. A lane graph is formed by using predicted lane pairs as nodes and predicted lane connectivity as edges. We evaluate the performance of LMT-Net on an internal dataset that consists of multiple vehicle observations as well as human annotations as Ground Truth (GT). The evaluation shows promising results and demonstrates superior performance compared to the implemented baseline on both highway and non-highway Operational Design Domain (ODD).
42.1CVApr 27
ARETE: Attention-based Rasterized Encoding for Topology Estimation using HSV-transformed Crowdsourced Vehicle Fleet DataDaniel Fritz, Dimitrios Lagamtzis, Michael Mink et al.
The continuous advancement of autonomous driving (AD) introduces challenges across multiple disciplines to ensure safe and efficient driving. One such challenge is the generation of High-Definition (HD) maps, which must remain up to date and highly accurate for downstream automotive tasks. One promising approach is the use of crowdsourced data from a vehicle fleet, representing road topology and lane-level features. This work focuses on the generation of centerlines and lane dividers from crowdsourced vehicle trajectories. We adopt a Detection Transformer (DETR)-based approach, where a rasterized representation of vehicle trajectories is used as input to predict vectorized lane representations. Each lane consists of a centerline with an associated direction and corresponding lane dividers that are geometrically constrained by the centerline. Our method includes the extraction of local tiles, from which crowdsourced vehicle trajectories are aggregated. Each tile undergoes a transformation into a rasterized representation encoding both the presence and direction of each trajectory, enabling the prediction of vectorized directed lanes. Experiments are conducted on an internal dataset as well as on the public datasets nuScenes and nuPlan.