15.9 | SY | Mar 20
On the Capacity of Future Lane-Free Urban Infrastructure
Patrick Malcolm, Klaus Bogenberger
In this paper, the potential capacity and spatial efficiency of future autonomous lane-free traffic in urban environments are explored using a combination of analytical and simulation-based approaches. For lane-free roadways, a simple analytical approach is employed, which shows not only that lane-free traffic offers a higher capacity than lane-based traffic for the same street width, but also that the relationship between capacity and street width is continuous under lane-free traffic. To test the potential capacity and properties of lane-free signal-free intersections (automated intersection management), two approaches were simulated and compared, including a novel approach which we call OptWULF. This approach uses a multi-agent conflict-based search approach with a low-level planner that uses a combination of optimization and simple window-based reservation. With these simulations, we confirm the continuous relationship between capacity and street width for intersection scenarios. We also show that OptWULF results in an even utilization of the entire drivable area of the street and intersection area. Furthermore, we show that OptWULF is capable of handling asymmetric demand patterns without any substantial loss in capacity compared to symmetric demand patterns.
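The contrast between a stepwise and a continuous capacity-width relationship can be illustrated with a toy model. This is a minimal sketch, not the paper's analytical approach: the lane width, the per-lane capacity, and the linear scaling of lane-free capacity with width are all assumed values chosen for illustration.

```python
import math

LANE_WIDTH_M = 3.0          # assumed lane width (illustrative, not from the paper)
CAPACITY_PER_LANE = 1800.0  # assumed capacity in veh/h per lane (illustrative)


def lane_based_capacity(street_width_m: float) -> float:
    """Only whole lanes fit, so capacity is a step function of street width."""
    lanes = math.floor(street_width_m / LANE_WIDTH_M)
    return lanes * CAPACITY_PER_LANE


def lane_free_capacity(street_width_m: float) -> float:
    """Toy assumption: capacity grows continuously (here linearly) with width."""
    return street_width_m / LANE_WIDTH_M * CAPACITY_PER_LANE


for width in (6.0, 7.5, 8.9, 9.0):
    print(width, lane_based_capacity(width), lane_free_capacity(width))
```

Under these assumptions the lane-based value stays flat between 6.0 m and 8.9 m and jumps only at 9.0 m, while the lane-free value rises with every added increment of width.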
25.7 | CV | May 12
Learning Ego-Centric BEV Representations from a Perspective-Privileged View: Cross-View Supervision for Online HD Map Construction
Daniel Lengerer, Mathias Pechinger, Klaus Bogenberger et al.
Bird's-eye-view (BEV) representations derived from multi-camera input have become a central interface for online high-definition (HD) map construction. However, most approaches rely solely on ego-centric supervision, requiring large-scale scene structure to be inferred from incomplete observations, occlusions, and diminishing information density at long range, where perspective effects and spatial sparsity hinder consistent structural reasoning. We introduce Cross-View Supervision (CVS), a representation learning paradigm that transfers geometric and topological priors from an ego-aligned overhead perspective into camera-based BEV encoders. Rather than adding auxiliary semantic losses, CVS aligns representations in a shared BEV feature space and distills globally consistent structural knowledge from a perspective-privileged teacher into the ego-centric backbone. This supervision enhances structural coherence without modifying the inference architecture or requiring overhead input at test time. Experiments on nuScenes using ego-aligned aerial imagery from the AID4AD cross-view extension demonstrate consistent improvements over StreamMapNet while maintaining identical camera-only inference. CVS yields +3.9 mAP in the standard 60 × 30 m region and +9.9 mAP in the extended 100 × 50 m setting, corresponding to a 44% relative gain at long range. These results highlight perspective-privileged structural supervision as a promising training principle for improving BEV representation learning in HD map construction.
CV | Aug 4, 2025 | Code
AID4AD: Aerial Image Data for Automated Driving Perception
Daniel Lengerer, Mathias Pechinger, Klaus Bogenberger et al.
This work investigates the integration of spatially aligned aerial imagery into perception tasks for automated vehicles (AVs). As a central contribution, we present AID4AD, a publicly available dataset that augments the nuScenes dataset with high-resolution aerial imagery precisely aligned to its local coordinate system. The alignment is performed using SLAM-based point cloud maps provided by nuScenes, establishing a direct link between the aerial data and the nuScenes local coordinate system. To ensure spatial fidelity, we propose an alignment workflow that corrects for localization and projection distortions. A manual quality control process further refines the dataset by identifying a set of high-quality alignments, which we publish as ground truth to support future research on automated registration. We demonstrate the practical value of AID4AD in two representative tasks: in online map construction, aerial imagery serves as a complementary input that improves the mapping process; in motion prediction, it functions as a structured environmental representation that replaces high-definition maps. Experiments show that aerial imagery leads to a 15-23% improvement in map construction accuracy and a 2% gain in trajectory prediction performance. These results highlight the potential of aerial imagery as a scalable and adaptable source of environmental context in automated vehicle systems, particularly in scenarios where high-definition maps are unavailable, outdated, or costly to maintain. AID4AD, along with evaluation code and pretrained models, is publicly released to foster further research in this direction: https://github.com/DriverlessMobility/AID4AD.
LG | Sep 2, 2025
Semi-on-Demand Transit Feeders with Shared Autonomous Vehicles and Reinforcement-Learning-Based Zonal Dispatching Control
Max T. M. Ng, Roman Engelhardt, Florian Dandl et al.
This paper develops a semi-on-demand transit feeder service using shared autonomous vehicles (SAVs) and zonal dispatching control based on reinforcement learning (RL). This service combines the cost-effectiveness of fixed-route transit with the adaptability of demand-responsive transport to improve accessibility in lower-density areas. Departing from the terminus, SAVs first make scheduled fixed stops, then offer on-demand pick-ups and drop-offs in a pre-determined flexible-route area. Our deep RL model dynamically assigns vehicles to subdivided flexible-route zones in response to real-time demand fluctuations and operations, using the policy gradient algorithm Proximal Policy Optimization. The methodology is demonstrated through agent-based simulations on a real-world bus route in Munich, Germany. Results show that after efficient training of the RL model, the semi-on-demand service with dynamic zonal control serves 16% more passengers at 13% higher generalized costs on average compared to traditional fixed-route service. The efficiency gain from RL control amounts to 2.4% more passengers at 1.4% higher costs. This study not only showcases the potential of integrating SAV feeders and machine learning techniques into public transit, but also lays the groundwork for further innovations in addressing first-mile-last-mile problems in multimodal transit systems.
CV | May 12, 2025
TUM2TWIN: Introducing the Large-Scale Multimodal Urban Digital Twin Benchmark Dataset
Olaf Wysocki, Benedikt Schwab, Manoj Kumar Biswanath et al.
Urban Digital Twins (UDTs) have become essential for managing cities and integrating complex, heterogeneous data from diverse sources. Creating UDTs involves challenges at multiple process stages, including acquiring accurate 3D source data, reconstructing high-fidelity 3D models, keeping models up to date, and ensuring seamless interoperability with downstream tasks. Current datasets are usually limited to one part of the processing chain, hampering comprehensive UDT validation. To address these challenges, we introduce the first comprehensive multimodal Urban Digital Twin benchmark dataset: TUM2TWIN. This dataset includes georeferenced, semantically aligned 3D models and networks along with various terrestrial, mobile, aerial, and satellite observations, comprising 32 data subsets over roughly 100,000 m² and currently 767 GB of data. By ensuring georeferenced indoor-outdoor acquisition, high accuracy, and multimodal data integration, the benchmark supports robust analysis of sensors and the development of advanced reconstruction methods. Additionally, we explore downstream tasks demonstrating the potential of TUM2TWIN, including novel view synthesis with NeRF and Gaussian Splatting, solar potential analysis, point cloud semantic segmentation, and LoD3 building reconstruction. We are convinced this contribution lays a foundation for overcoming current limitations in UDT creation, fostering new research directions and practical solutions for smarter, data-driven urban environments. The project is available at: https://tum2t.win
CV | Mar 6, 2024
Temporal Enhanced Floating Car Observers
Jeremias Gerner, Klaus Bogenberger, Stefanie Schmidtner
Floating Car Observers (FCOs) are an innovative method to collect traffic data by deploying sensor-equipped vehicles to detect and locate other vehicles. We demonstrate that even a small penetration rate of FCOs can identify a significant number of vehicles at a given intersection. This is achieved through the emulation of detection within a microscopic traffic simulation. Additionally, leveraging data from previous moments can enhance the detection of vehicles in the current frame. Our findings indicate that, with a 20-second observation window, it is possible to recover up to 20% of vehicles that are not visible to FCOs in the current timestep. To exploit this, we developed a data-driven strategy, utilizing sequences of Bird's Eye View (BEV) representations of detected vehicles and deep learning models. This approach aims to bring currently undetected vehicles into view in the present moment, complementing the currently detected vehicles. Results of different spatiotemporal architectures show that up to 41% of the vehicles can be recovered into the current timestep at their current position. This enhancement enriches the information initially available from the FCO, allowing an improved estimation of traffic states and metrics (e.g. density and queue length) for improved implementation of traffic management strategies.
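The observation-window idea can be sketched as a simple set operation: a vehicle counts as a recovery candidate if an FCO detected it at some point in the preceding window but not in the current timestep. This is a minimal sketch under assumed data structures; the function name, the dictionary layout, and the vehicle ids are hypothetical. It does not reproduce the paper's deep learning step, which also estimates each vehicle's current position.

```python
def recoverable_vehicles(detections, now, window_s=20.0):
    """Return ids seen within the last `window_s` seconds but not at `now`.

    detections: dict mapping timestep in seconds -> set of detected vehicle ids
    (hypothetical layout, chosen for illustration).
    """
    current = detections.get(now, set())
    seen_recently = set()
    for t, ids in detections.items():
        # Keep only past timesteps that fall inside the observation window.
        if now - window_s <= t < now:
            seen_recently |= ids
    return seen_recently - current


detections = {
    0.0: {"a"},          # too old for a 20 s window at t = 100 s
    85.0: {"b", "c"},
    95.0: {"c", "d"},
    100.0: {"d", "e"},   # current timestep
}
print(sorted(recoverable_vehicles(detections, now=100.0)))
```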
SOC-PH | Apr 29, 2025
Floating Car Observers in Intelligent Transportation Systems: Detection Modeling and Temporal Insights
Jeremias Gerner, Klaus Bogenberger, Stefanie Schmidtner
Floating Car Observers (FCOs) extend traditional Floating Car Data (FCD) by integrating onboard sensors to detect and localize other traffic participants, providing richer and more detailed traffic data. In this work, we explore various modeling approaches for FCO detections within microscopic traffic simulations to evaluate their potential for Intelligent Transportation System (ITS) applications. These approaches range from 2D raytracing to high-fidelity co-simulations that emulate real-world sensors and integrate 3D object detection algorithms to closely replicate FCO detections. Additionally, we introduce a neural network-based emulation technique that effectively approximates the results of high-fidelity co-simulations. This approach captures the unique characteristics of FCO detections while offering a fast and scalable solution for modeling. Using this emulation method, we investigate the impact of FCO data in a digital twin of a traffic network modeled in SUMO. Results demonstrate that even at a 20% penetration rate, FCOs using LiDAR-based detections can identify 65% of vehicles across various intersections and traffic demand scenarios. Further potential emerges when temporal insights are integrated, enabling the recovery of previously detected but currently unseen vehicles. By employing data-driven methods, we recover over 80% of these vehicles with minimal positional deviations. These findings underscore the potential of FCOs for ITS, particularly in enhancing traffic state estimation and monitoring under varying penetration rates and traffic conditions.
CV | Jan 25, 2024
Unlocking Past Information: Temporal Embeddings in Cooperative Bird's Eye View Prediction
Dominik Rößle, Jeremias Gerner, Klaus Bogenberger et al.
Accurate and comprehensive semantic segmentation of Bird's Eye View (BEV) is essential for ensuring safe and proactive navigation in autonomous driving. Although cooperative perception has exceeded the detection capabilities of single-agent systems, prevalent camera-based algorithms in cooperative perception neglect valuable information derived from historical observations. This limitation becomes critical during sensor failures or communication issues as cooperative perception reverts to single-agent perception, leading to degraded performance and incomplete BEV segmentation maps. This paper introduces TempCoBEV, a temporal module designed to incorporate historical cues into current observations, thereby improving the quality and reliability of BEV map segmentations. We propose an importance-guided attention architecture to effectively integrate temporal information that prioritizes relevant properties for BEV map segmentation. TempCoBEV is an independent temporal module that seamlessly integrates into state-of-the-art camera-based cooperative perception models. We demonstrate through extensive experiments on the OPV2V dataset that TempCoBEV performs better than non-temporal models in predicting current and future BEV map segmentations, particularly in scenarios involving communication failures. We show the efficacy of TempCoBEV and its capability to integrate historical cues into the current BEV map, improving predictions under optimal communication conditions by up to 2% and under communication failures by up to 19%. The code will be published on GitHub.
LG | Apr 19, 2021
Estimating Traffic Speeds using Probe Data: A Deep Neural Network Approach
Felix Rempe, Philipp Franeck, Klaus Bogenberger
This paper presents a dedicated Deep Neural Network (DNN) architecture that reconstructs space-time traffic speeds on freeways given sparse data. The DNN is constructed in such a way that it learns heterogeneous congestion patterns using a large dataset of sparse speed data, in particular from probe vehicles. The input to the DNN consists of two equally sized matrices: one containing raw measurement data, and the other indicating the cells occupied with data. The DNN, comprising multiple stacked convolutional layers with an encoding-decoding structure and feed-forward paths, transforms the input into a full matrix of traffic speeds. The proposed DNN architecture is evaluated with respect to its ability to accurately reconstruct heterogeneous congestion patterns under varying input data sparsity. For this purpose, a large set of empirical Floating-Car Data (FCD) collected on German freeway A9 during two months is utilized. In total, 43 distinct congestion scenarios are observed, which comprise moving and stationary congestion patterns. A data augmentation technique is applied to generate input-output samples of the data, which makes the DNN shift-invariant as well as capable of managing varying data sparsities. The DNN is trained and subsequently applied to sparse data of an unseen congestion scenario. The results show that the DNN is able to apply learned patterns and reconstructs moving as well as stationary congested traffic with high accuracy, even given highly sparse input data. Reconstructed speeds are compared qualitatively and quantitatively with the results of several state-of-the-art methods such as the Adaptive Smoothing Method (ASM), the Phase-Based Smoothing Method (PSM) and a standard Convolutional Neural Network (CNN) architecture. The DNN outperforms the other methods significantly.
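The two-matrix input described in the abstract can be sketched as follows: one space-time grid holds the measured speeds, and a second grid of the same size marks which cells contain data. This is a minimal sketch, not the authors' preprocessing. The function name, the grid indexing, the averaging of multiple measurements per cell, and the use of 0.0 for empty cells are assumptions made for illustration.

```python
def build_dnn_inputs(probe_points, n_space, n_time):
    """Build a speed grid and an occupancy mask of equal size.

    probe_points: iterable of (space_idx, time_idx, speed_kmh) tuples
    (hypothetical layout, chosen for illustration).
    """
    sums = [[0.0] * n_time for _ in range(n_space)]
    counts = [[0] * n_time for _ in range(n_space)]
    for i, j, v in probe_points:
        sums[i][j] += v
        counts[i][j] += 1

    # Assumption: average multiple measurements in a cell; empty cells get 0.0.
    speeds = [
        [sums[i][j] / counts[i][j] if counts[i][j] else 0.0 for j in range(n_time)]
        for i in range(n_space)
    ]
    # The mask lets a model distinguish "no data" from a measured speed of 0.
    mask = [
        [1.0 if counts[i][j] else 0.0 for j in range(n_time)]
        for i in range(n_space)
    ]
    return speeds, mask


points = [(0, 0, 100.0), (0, 0, 80.0), (2, 1, 30.0)]
speeds, mask = build_dnn_inputs(points, n_space=3, n_time=2)
print(speeds)
print(mask)
```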
SOC-PH | Sep 17, 2020
Feature Engineering for Data-driven Traffic State Forecast in Urban Road Networks
Felix Rempe, Klaus Bogenberger
Most traffic state forecast algorithms, when applied to urban road networks, consider only the links in close proximity to the target location. However, for longer-term forecasts, the traffic states of more distant links or regions of the network are also expected to provide valuable information for a data-driven algorithm. This paper studies these expectations using a network clustering algorithm and one year of Floating Car Data (FCD) collected by a large fleet of vehicles. First, a clustering algorithm is applied to the data in order to extract congestion-prone regions in the Munich city network. The level of congestion inside these clusters is analyzed with the help of statistical tools. Clear spatio-temporal congestion patterns and correlations between the clustered regions are identified. These correlations are integrated into a K-Nearest Neighbors (KNN) travel time prediction algorithm. In a comparison with other approaches, this method achieves the best results. The statistical results and the performance of the KNN predictor indicate that the consideration of the network-wide traffic is a valuable feature for predictors and a promising way to develop more accurate algorithms in the future.
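A KNN predictor that uses network-wide features might look like the following. This is a minimal sketch of plain KNN regression, not the paper's algorithm: the two-element feature vector (local congestion level, congestion level in a correlated distant cluster), the Euclidean distance, the unweighted averaging, and all sample values are assumptions made for illustration.

```python
import math


def knn_predict(history, query, k=3):
    """Average the travel times of the k historical states closest to `query`.

    history: list of (feature_vector, travel_time_s) pairs (hypothetical layout).
    """
    nearest = sorted(history, key=lambda item: math.dist(item[0], query))[:k]
    return sum(travel_time for _, travel_time in nearest) / len(nearest)


# Features: (local congestion level, congestion level in a distant cluster).
history = [
    ((0.1, 0.1), 300),
    ((0.2, 0.1), 320),
    ((0.8, 0.9), 600),
    ((0.9, 0.8), 640),
    ((0.2, 0.9), 450),
]
print(knn_predict(history, query=(0.85, 0.85), k=2))
```

Including the distant-cluster feature means that two states with the same local congestion can map to different neighbors, which is the effect the abstract attributes to network-wide information.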