Martin Lumiste

h-index1

3papers

6citations

3 Papers

6.6LGMar 14, 2023Code

Traffic4cast at NeurIPS 2022 -- Predict Dynamics along Graph Edges from Sparse Node Data: Whole City Traffic and ETA from Stationary Vehicle Detectors

Moritz Neun, Christian Eichenberger, Henry Martin et al.

The global trends of urbanization and increased personal mobility force us to rethink the way we live and use urban space. The Traffic4cast competition series tackles this problem in a data-driven way, advancing the latest methods in machine learning for modeling complex spatial systems over time. In this edition, our dynamic road graph data combine information from road maps, $10^{12}$ probe data points, and stationary vehicle detectors in three cities over the span of two years. While stationary vehicle detectors are the most accurate way to capture traffic volume, they are only available in few locations. Traffic4cast 2022 explores models that have the ability to generalize loosely related temporal vertex data on just a few nodes to predict dynamic future traffic states on the edges of the entire road graph. In the core challenge, participants are invited to predict the likelihoods of three congestion classes derived from the speed levels in the GPS data for the entire road graph in three cities 15 min into the future. We only provide vehicle count data from spatially sparse stationary vehicle detectors in these three cities as model input for this task. The data are aggregated in 15 min time bins for one hour prior to the prediction time. For the extended challenge, participants are tasked to predict the average travel times on super-segments 15 min into the future - super-segments are longer sequences of road segments in the graph. The competition results provide an important advance in the prediction of complex city-wide traffic states just from publicly available sparse vehicle data and without the need for large amounts of real-time floating vehicle data.

4.6LGOct 31, 2022Code

Large scale traffic forecasting with gradient boosting, Traffic4cast 2022 challenge

Martin Lumiste, Andrei Ilie

Accurate traffic forecasting is of the utmost importance for optimal travel planning and for efficient city mobility. IARAI (The Institute of Advanced Research in Artificial Intelligence) organizes Traffic4cast, a yearly traffic prediction competition based on real-life data [https://www.iarai.ac.at/traffic4cast/], aiming to leverage artificial intelligence advances for producing accurate traffic estimates. We present our solution to the IARAI Traffic4cast 2022 competition, in which the goal is to develop algorithms for predicting road graph edge congestion classes and supersegment-level travel times. In contrast to the previous years, this year's competition focuses on modelling graph edge level behaviour, rather than more coarse aggregated grid-based traffic movies. Due to this, we leverage a method familiar from tabular data modelling -- gradient-boosted decision tree ensembles. We reduce the dimensionality of the input data representing traffic counters with the help of the classic PCA method and feed it as input to a LightGBM model. This simple, fast, and scalable technique allowed us to win second place in the core competition. The source code and references to trained model files and submissions are available at https://github.com/skandium/t4c22 .

IVJun 26

MLVC: Multi-platform Learned Video Codec for Real-World Deployment

Tanel Pärnamaa, Martin Lumiste, Ardi Loot et al.

Neural video codecs have surpassed classical codecs in coding efficiency but remain impractical for deployment due to cross-platform incompatibility and high computational cost. Existing quantization-based solutions fail to produce deterministic results across diverse hardware platforms, leading to catastrophic decoding failures. We introduce MLVC, a hardware-robust neural video codec designed for practical cross-platform inference. The key idea is to explicitly transmit scale parameters through the hyperprior, which guarantees entropy coding consistency across devices without requiring bit-exact arithmetic. While this increases bitrate overhead, we recover most of the coding efficiency through architectural improvements (gated memory, ReGLU activation), a long-term reference recovery mechanism, and domain-specific perceptual training. On the VCD video conferencing benchmark, MLVC achieves >70% BD-rate (MOS) improvement over hardware HEVC, the strongest deployable baseline, while reaching subjective quality competitive with DCVC-RT, which cannot operate across diverse platforms. Both the encoder and decoder run at 100 FPS on average on commodity NPUs from Apple, Intel, and Qualcomm. MLVC is the first neural video codec to combine competitive compression performance, real-time speed, and cross-platform robustness across diverse consumer devices, making it suitable for widespread deployment. Code will be released.