Wolfgang Utschick

LG
h-index40
27papers
204citations
Novelty47%
AI Score54

27 Papers

CVJul 18, 2022Code
ExAgt: Expert-guided Augmentation for Representation Learning of Traffic Scenarios

Lakshman Balasubramanian, Jonas Wurst, Robin Egolf et al.

Representation learning in recent years has been addressed with self-supervised learning methods. The input data is augmented into two distorted views and an encoder learns the representations that are invariant to distortions -- cross-view prediction. Augmentation is one of the key components in cross-view self-supervised learning frameworks to learn visual representations. This paper presents ExAgt, a novel method to include expert knowledge for augmenting traffic scenarios, to improve the learnt representations without any human annotation. The expert-guided augmentations are generated in an automated fashion based on the infrastructure, the interactions between the EGO and the traffic participants and an ideal sensor model. The ExAgt method is applied in two state-of-the-art cross-view prediction methods and the representations learnt are tested in downstream tasks like classification and clustering. Results show that the ExAgt method improves representation learning compared to using only standard augmentations and it provides a better representation space stability. The code is available at https://github.com/lab176344/ExAgt.

ITMay 5
Is Lattice Reduction Necessary for Vector Perturbation Precoding?

Dominik Semmler, Wolfgang Utschick, Michael Joham

Vector perturbation (VP) precoding is an effective nonlinear precoding technique in the downlink (DL) with modulo channels, providing an approximation of dirty paper coding (DPC) which is capacity-achieving. Especially, when combined with Lattice reduction (LR), low-complexity algorithms achieve a very promising performance, outperforming other popular non-linear precoding techniques like Tomlinson-Harashima precoding (THP). However, these results are based on the symbol error rate (SER) or bit error rate (BER). When shifting the focus to the mutual information as the figure of merit, we show that this is different and that the underlying lattice problem has a unique structural property. For lattice problems with this special structure, we show for a whole class of algorithms that LR does not have any impact on the solution vector. At the same time, algorithms are identified which benefit from LR, even if this lattice structure arises. The provided structural analysis has strong implications on the performance evaluation of VP. In particular, we re-evaluate popular Lenstra-Lenstra-Lovász (LLL)-aided methods like the LLL-aided nearest plane (NP) algorithm and show that they do not outperform conventional THP, highlighting the effectiveness of the THP method. This is in contrast to the existing results based on SER and BER where these methods clearly outperform THP.

MLFeb 1, 2023
Reverse Ordering Techniques for Attention-Based Channel Prediction

Valentina Rizzello, Benedikt Böck, Michael Joham et al.

This work aims to predict channels in wireless communication systems based on noisy observations, utilizing sequence-to-sequence models with attention (Seq2Seq-attn) and transformer models. Both models are adapted from natural language processing to tackle the complex challenge of channel prediction. Additionally, a new technique called reverse positional encoding is introduced in the transformer model to improve the robustness of the model against varying sequence lengths. Similarly, the encoder outputs of the Seq2Seq-attn model are reversed before applying attention. Simulation results demonstrate that the proposed ordering techniques allow the models to better capture the relationships between the channel snapshots within the sequence, irrespective of the sequence length, as opposed to existing methods.

CVJul 19, 2022
Expert-LaSTS: Expert-Knowledge Guided Latent Space for Traffic Scenarios

Jonas Wurst, Lakshman Balasubramanian, Michael Botsch et al.

Clustering traffic scenarios and detecting novel scenario types are required for scenario-based testing of autonomous vehicles. These tasks benefit from either good similarity measures or good representations for the traffic scenarios. In this work, an expert-knowledge aided representation learning for traffic scenarios is presented. The latent space so formed is used for successful clustering and novel scenario type detection. Expert-knowledge is used to define objectives that the latent representations of traffic scenarios shall fulfill. It is presented, how the network architecture and loss is designed from these objectives, thereby incorporating expert-knowledge. An automatic mining strategy for traffic scenarios is presented, such that no manual labeling is required. Results show the performance advantage compared to baseline methods. Additionally, extensive analysis of the latent space is performed.

SPJul 13, 2022
Learning Representations for CSI Adaptive Quantization and Feedback

Valentina Rizzello, Matteo Nerini, Michael Joham et al.

In this work, we propose an efficient method for channel state information (CSI) adaptive quantization and feedback in frequency division duplexing (FDD) systems. Existing works mainly focus on the implementation of autoencoder (AE) neural networks (NNs) for CSI compression, and consider straightforward quantization methods, e.g., uniform quantization, which are generally not optimal. With this strategy, it is hard to achieve a low reconstruction error, especially, when the available number of bits reserved for the latent space quantization is small. To address this issue, we recommend two different methods: one based on a post training quantization and the second one in which the codebook is found during the training of the AE. Both strategies achieve better reconstruction accuracy compared to standard quantization techniques.

LGApr 21, 2023
Gradient Derivation for Learnable Parameters in Graph Attention Networks

Marion Neumeier, Andreas Tollkühn, Sebastian Dorn et al.

This work provides a comprehensive derivation of the parameter gradients for GATv2 [4], a widely used implementation of Graph Attention Networks (GATs). GATs have proven to be powerful frameworks for processing graph-structured data and, hence, have been used in a range of applications. However, the achieved performance by these attempts has been found to be inconsistent across different datasets and the reasons for this remains an open research question. As the gradient flow provides valuable insights into the training dynamics of statistically learning models, this work obtains the gradients for the trainable model parameters of GATv2. The gradient derivations supplement the efforts of [2], where potential pitfalls of GATv2 are investigated.

AIAug 16, 2023
Prediction and Interpretation of Vehicle Trajectories in the Graph Spectral Domain

Marion Neumeier, Sebastian Dorn, Michael Botsch et al.

This work provides a comprehensive analysis and interpretation of the graph spectral representation of traffic scenarios. Based on a spatio-temporal vehicle interaction graph, an observed traffic scenario can be transformed into the graph spectral domain by means of the multidimensional Graph Fourier Transformation. Since these spectral scenario representations have shown to successfully incorporate the complex and interactive nature of traffic scenarios, the beneficial feature representation is employed for the purpose of predicting vehicle trajectories. This work introduces GFTNNv2, a deep learning network predicting vehicle trajectories in the graph spectral domain. Evaluation of the GFTNNv2 on the publicly available datasets highD and NGSIM shows a performance gain of up to 25% in comparison to state-of-the-art prediction approaches.

CVMar 17
Behavior-Centric Extraction of Scenarios from Highway Traffic Data and their Domain-Knowledge-Guided Clustering using CVQ-VAE

Niklas Roßberg, Sinan Hasirlioglu, Mohamed Essayed Bouzouraa et al.

Approval of ADS depends on evaluating its behavior within representative real-world traffic scenarios. A common way to obtain such scenarios is to extract them from real-world data recordings. These can then be grouped and serve as basis on which the ADS is subsequently tested. This poses two central challenges: how scenarios are extracted and how they are grouped. Existing extraction methods rely on heterogeneous definitions, hindering scenario comparability. For the grouping of scenarios, rule-based or ML-based methods can be utilized. However, while modern ML-based approaches can handle the complexity of traffic scenarios, unlike rule-based approaches, they lack interpretability and may not align with domain-knowledge. This work contributes to a standardized scenario extraction based on the Scenario-as-Specification concept, as well as a domain-knowledge-guided scenario clustering process. Experiments on the highD dataset demonstrate that scenarios can be extracted reliably and that domain-knowledge can be effectively integrated into the clustering process. As a result, the proposed methodology supports a more standardized process for deriving scenario categories from highway data recordings and thus enables a more efficient validation process of automated vehicles.

LGFeb 24
Uncertainty-Aware Diffusion Model for Multimodal Highway Trajectory Prediction via DDIM Sampling

Marion Neumeier, Niklas Roßberg, Michael Botsch et al.

Accurate and uncertainty-aware trajectory prediction remains a core challenge for autonomous driving, driven by complex multi-agent interactions, diverse scene contexts and the inherently stochastic nature of future motion. Diffusion-based generative models have recently shown strong potential for capturing multimodal futures, yet existing approaches such as cVMD suffer from slow sampling, limited exploitation of generative diversity and brittle scenario encodings. This work introduces cVMDx, an enhanced diffusion-based trajectory prediction framework that improves efficiency, robustness and multimodal predictive capability. Through DDIM sampling, cVMDx achieves up to a 100x reduction in inference time, enabling practical multi-sample generation for uncertainty estimation. A fitted Gaussian Mixture Model further provides tractable multimodal predictions from the generated trajectories. In addition, a CVQ-VAE variant is evaluated for scenario encoding. Experiments on the publicly available highD dataset show that cVMDx achieves higher accuracy and significantly improved efficiency over cVMD, enabling fully stochastic, multimodal trajectory prediction.

IVNov 24, 2023
Unsupervised high-throughput segmentation of cells and cell nuclei in quantitative phase images

Julia Sistermanns, Ellen Emken, Gregor Weirich et al.

In the effort to aid cytologic diagnostics by establishing automatic single cell screening using high throughput digital holographic microscopy for clinical studies thousands of images and millions of cells are captured. The bottleneck lies in an automatic, fast, and unsupervised segmentation technique that does not limit the types of cells which might occur. We propose an unsupervised multistage method that segments correctly without confusing noise or reflections with cells and without missing cells that also includes the detection of relevant inner structures, especially the cell nucleus in the unstained cell. In an effort to make the information reasonable and interpretable for cytopathologists, we also introduce new cytoplasmic and nuclear features of potential help for cytologic diagnoses which exploit the quantitative phase information inherent to the measurement scheme. We show that the segmentation provides consistently good results over many experiments on patient samples in a reasonable per cell analysis time.

SPMar 6, 2024
Diffusion-based Generative Prior for Low-Complexity MIMO Channel Estimation

Benedikt Fesl, Michael Baur, Florian Strasser et al.

This work proposes a novel channel estimator based on diffusion models (DMs), one of the currently top-rated generative models. Contrary to related works utilizing generative priors, a lightweight convolutional neural network (CNN) with positional embedding of the signal-to-noise ratio (SNR) information is designed by learning the channel distribution in the sparse angular domain. Combined with an estimation strategy that avoids stochastic resampling and truncates reverse diffusion steps that account for lower SNR than the given pilot observation, the resulting DM estimator has both low complexity and memory overhead. Numerical results exhibit better performance than state-of-the-art channel estimators utilizing generative priors.

LGMay 23, 2024
Reliable Trajectory Prediction and Uncertainty Quantification with Conditioned Diffusion Models

Marion Neumeier, Sebastian Dorn, Michael Botsch et al.

This work introduces the conditioned Vehicle Motion Diffusion (cVMD) model, a novel network architecture for highway trajectory prediction using diffusion models. The proposed model ensures the drivability of the predicted trajectory by integrating non-holonomic motion constraints and physical constraints into the generative prediction module. Central to the architecture of cVMD is its capacity to perform uncertainty quantification, a feature that is crucial in safety-critical applications. By integrating the quantified uncertainty into the prediction process, the cVMD's trajectory prediction performance is improved considerably. The model's performance was evaluated using the publicly available highD dataset. Experiments show that the proposed architecture achieves competitive trajectory prediction accuracy compared to state-of-the-art models, while providing guaranteed drivable trajectories and uncertainty quantification.

LGMar 5, 2024
On the Asymptotic Mean Square Error Optimality of Diffusion Models

Benedikt Fesl, Benedikt Böck, Florian Strasser et al.

Diffusion models (DMs) as generative priors have recently shown great potential for denoising tasks but lack theoretical understanding with respect to their mean square error (MSE) optimality. This paper proposes a novel denoising strategy inspired by the structure of the MSE-optimal conditional mean estimator (CME). The resulting DM-based denoiser can be conveniently employed using a pre-trained DM, being particularly fast by truncating reverse diffusion steps and not requiring stochastic re-sampling. We present a comprehensive (non-)asymptotic optimality analysis of the proposed diffusion-based denoiser, demonstrating polynomial-time convergence to the CME under mild conditions. Our analysis also derives a novel Lipschitz constant that depends solely on the DM's hyperparameters. Further, we offer a new perspective on DMs, showing that they inherently combine an asymptotically optimal denoiser with a powerful generator, modifiable by switching re-sampling in the reverse process on or off. The theoretical findings are thoroughly validated with experiments based on various benchmark datasets

MLNov 14, 2024
Sparse Bayesian Generative Modeling for Compressive Sensing

Benedikt Böck, Sadaf Syed, Wolfgang Utschick

This work addresses the fundamental linear inverse problem in compressive sensing (CS) by introducing a new type of regularizing generative prior. Our proposed method utilizes ideas from classical dictionary-based CS and, in particular, sparse Bayesian learning (SBL), to integrate a strong regularization towards sparse solutions. At the same time, by leveraging the notion of conditional Gaussianity, it also incorporates the adaptability from generative models to training data. However, unlike most state-of-the-art generative models, it is able to learn from a few compressed and noisy data samples and requires no optimization algorithm for solving the inverse problem. Additionally, similar to Dirichlet prior networks, our model parameterizes a conjugate prior enabling its application for uncertainty quantification. We support our approach theoretically through the concept of variational inference and validate it empirically using different types of compressible signals.

CVOct 27, 2025
UrbanIng-V2X: A Large-Scale Multi-Vehicle, Multi-Infrastructure Dataset Across Multiple Intersections for Cooperative Perception

Karthikeyan Chandra Sekaran, Markus Geisler, Dominik Rößle et al.

Recent cooperative perception datasets have played a crucial role in advancing smart mobility applications by enabling information exchange between intelligent agents, helping to overcome challenges such as occlusions and improving overall scene understanding. While some existing real-world datasets incorporate both vehicle-to-vehicle and vehicle-to-infrastructure interactions, they are typically limited to a single intersection or a single vehicle. A comprehensive perception dataset featuring multiple connected vehicles and infrastructure sensors across several intersections remains unavailable, limiting the benchmarking of algorithms in diverse traffic environments. Consequently, overfitting can occur, and models may demonstrate misleadingly high performance due to similar intersection layouts and traffic participant behavior. To address this gap, we introduce UrbanIng-V2X, the first large-scale, multi-modal dataset supporting cooperative perception involving vehicles and infrastructure sensors deployed across three urban intersections in Ingolstadt, Germany. UrbanIng-V2X consists of 34 temporally aligned and spatially calibrated sensor sequences, each lasting 20 seconds. All sequences contain recordings from one of three intersections, involving two vehicles and up to three infrastructure-mounted sensor poles operating in coordinated scenarios. In total, UrbanIng-V2X provides data from 12 vehicle-mounted RGB cameras, 2 vehicle LiDARs, 17 infrastructure thermal cameras, and 12 infrastructure LiDARs. All sequences are annotated at a frequency of 10 Hz with 3D bounding boxes spanning 13 object classes, resulting in approximately 712k annotated instances across the dataset. We provide comprehensive evaluations using state-of-the-art cooperative perception methods and publicly release the codebase, dataset, HD map, and a digital twin of the complete data collection environment.

ITOct 10, 2025
Precoder Design in Multi-User FDD Systems with VQ-VAE and GNN

Srikar Allaparapu, Michael Baur, Benedikt Böck et al.

Robust precoding is efficiently feasible in frequency division duplex (FDD) systems by incorporating the learnt statistics of the propagation environment through a generative model. We build on previous work that successfully designed site-specific precoders based on a combination of Gaussian mixture models (GMMs) and graph neural networks (GNNs). In this paper, by utilizing a vector quantized-variational autoencoder (VQ-VAE), we circumvent one of the key drawbacks of GMMs, i.e., the number of GMM components scales exponentially to the feedback bits. In addition, the deep learning architecture of the VQ-VAE allows us to jointly train the GNN together with VQ-VAE along with pilot optimization forming an end-to-end (E2E) model, resulting in considerable performance gains in sum rate for multi-user wireless systems. Simulations demonstrate the superiority of the proposed frameworks over the conventional methods involving the sub-discrete Fourier transform (DFT) pilot matrix and iterative precoder algorithms enabling the deployment of systems characterized by fewer pilots or feedback bits.

ROApr 8, 2025
Uncertainty-Aware Hybrid Machine Learning in Virtual Sensors for Vehicle Sideslip Angle Estimation

Abinav Kalyanasundaram, Karthikeyan Chandra Sekaran, Philipp Stauber et al.

Precise vehicle state estimation is crucial for safe and reliable autonomous driving. The number of measurable states and their precision offered by the onboard vehicle sensor system are often constrained by cost. For instance, measuring critical quantities such as the Vehicle Sideslip Angle (VSA) poses significant commercial challenges using current optical sensors. This paper addresses these limitations by focusing on the development of high-performance virtual sensors to enhance vehicle state estimation for active safety. The proposed Uncertainty-Aware Hybrid Learning (UAHL) architecture integrates a machine learning model with vehicle motion models to estimate VSA directly from onboard sensor data. A key aspect of the UAHL architecture is its focus on uncertainty quantification for individual model estimates and hybrid fusion. These mechanisms enable the dynamic weighting of uncertainty-aware predictions from machine learning and vehicle motion models to produce accurate and reliable hybrid VSA estimates. This work also presents a novel dataset named Real-world Vehicle State Estimation Dataset (ReV-StED), comprising synchronized measurements from advanced vehicle dynamic sensors. The experimental results demonstrate the superior performance of the proposed method for VSA estimation, highlighting UAHL as a promising architecture for advancing virtual sensors and enhancing active safety in autonomous vehicles.

SPApr 26, 2024
Enhancing Channel Estimation in Quantized Systems with a Generative Prior

Benedikt Fesl, Aziz Banna, Wolfgang Utschick

Channel estimation in quantized systems is challenging, particularly in low-resolution systems. In this work, we propose to leverage a Gaussian mixture model (GMM) as generative prior, capturing the channel distribution of the propagation environment, to enhance a classical estimation technique based on the expectation-maximization (EM) algorithm for one-bit quantization. Thereby, a maximum a posteriori (MAP) estimate of the most responsible mixture component is inferred for a quantized received signal, which is subsequently utilized in the EM algorithm as side information. Numerical results demonstrate the significant performance improvement of our proposed approach over both a simplistic Gaussian prior and current state-of-the-art channel estimators. Furthermore, the proposed estimation framework exhibits adaptability to higher resolution systems and alternative generative priors.

LGMay 25, 2023
Optimization and Interpretability of Graph Attention Networks for Small Sparse Graph Structures in Automotive Applications

Marion Neumeier, Andreas Tollkühn, Sebastian Dorn et al.

For automotive applications, the Graph Attention Network (GAT) is a prominently used architecture to include relational information of a traffic scenario during feature embedding. As shown in this work, however, one of the most popular GAT realizations, namely GATv2, has potential pitfalls that hinder an optimal parameter learning. Especially for small and sparse graph structures a proper optimization is problematic. To surpass limitations, this work proposes architectural modifications of GATv2. In controlled experiments, it is shown that the proposed model adaptions improve prediction performance in a node-level regression task and make it more robust to parameter initialization. This work aims for a better understanding of the attention mechanism and analyzes its interpretability of identifying causal importance.

LGMay 22, 2023
On Learning the Tail Quantiles of Driving Behavior Distributions via Quantile Regression and Flows

Jia Yu Tee, Oliver De Candido, Wolfgang Utschick et al.

Towards safe autonomous driving (AD), we consider the problem of learning models that accurately capture the diversity and tail quantiles of human driver behavior probability distributions, in interaction with an AD vehicle. Such models, which predict drivers' continuous actions from their states, are particularly relevant for closing the gap between AD agent simulations and reality. To this end, we adapt two flexible quantile learning frameworks for this setting that avoid strong distributional assumptions: (1) quantile regression (based on the titled absolute loss), and (2) autoregressive quantile flows (a version of normalizing flows). Training happens in a behavior cloning-fashion. We use the highD dataset consisting of driver trajectories on several highways. We evaluate our approach in a one-step acceleration prediction task, and in multi-step driver simulation rollouts. We report quantitative results using the tilted absolute loss as metric, give qualitative examples showing that realistic extremal behavior can be learned, and discuss the main insights.

LGMay 12, 2023
A Multidimensional Graph Fourier Transformation Neural Network for Vehicle Trajectory Prediction

Marion Neumeier, Andreas Tollkühn, Michael Botsch et al.

This work introduces the multidimensional Graph Fourier Transformation Neural Network (GFTNN) for long-term trajectory predictions on highways. Similar to Graph Neural Networks (GNNs), the GFTNN is a novel network architecture that operates on graph structures. While several GNNs lack discriminative power due to suboptimal aggregation schemes, the proposed model aggregates scenario properties through a powerful operation: the multidimensional Graph Fourier Transformation (GFT). The spatio-temporal vehicle interaction graph of a scenario is converted into a spectral scenario representation using the GFT. This beneficial representation is input to the prediction framework composed of a neural network and a descriptive decoder. Even though the proposed GFTNN does not include any recurrent element, it outperforms state-of-the-art models in the task of highway trajectory prediction. For experiments and evaluation, the publicly available datasets highD and NGSIM are used

SPNov 18, 2021
CSI Clustering with Variational Autoencoding

Michael Baur, Michael Würth, Michael Koller et al.

The model order of a wireless channel plays an important role for a variety of applications in communications engineering, e.g., it represents the number of resolvable incident wavefronts with non-negligible power incident from a transmitter to a receiver. Areas such as direction of arrival estimation leverage the model order to analyze the multipath components of channel state information. In this work, we propose to use a variational autoencoder to group unlabeled channel state information with respect to the model order in the variational autoencoder latent space in an unsupervised manner. We validate our approach with simulated 3GPP channel data. Our results suggest that, in order to learn an appropriate clustering, it is crucial to use a more flexible likelihood model for the variational autoencoder decoder than it is usually the case in standard applications.

SPOct 14, 2021
Learning a Compressive Sensing Matrix with Structural Constraints via Maximum Mean Discrepancy Optimization

Michael Koller, Wolfgang Utschick

We introduce a learning-based algorithm to obtain a measurement matrix for compressive sensing related recovery problems. The focus lies on matrices with a constant modulus constraint which typically represent a network of analog phase shifters in hybrid precoding/combining architectures. We interpret a matrix with restricted isometry property as a mapping of points from a high- to a low-dimensional hypersphere. We argue that points on the low-dimensional hypersphere, namely, in the range of the matrix, should be uniformly distributed to increase robustness against measurement noise. This notion is formalized in an optimization problem which uses one of the maximum mean discrepancy metrics in the objective function. Recent success of such metrics in neural network related topics motivate a solution of the problem based on machine learning. Numerical experiments show better performance than random measurement matrices that are generally employed in compressive sensing contexts. Further, we adapt a method from the literature to the constant modulus constraint. This method can also compete with random matrices and it is shown to harmonize well with the proposed learning-based approach if it is used as an initialization. Lastly, we describe how other structural matrix constraints, e.g., a Toeplitz constraint, can be taken into account, too.

LGMay 27, 2021
Classification and Uncertainty Quantification of Corrupted Data using Semi-Supervised Autoencoders

Philipp Joppich, Sebastian Dorn, Oliver De Candido et al.

Parametric and non-parametric classifiers often have to deal with real-world data, where corruptions like noise, occlusions, and blur are unavoidable - posing significant challenges. We present a probabilistic approach to classify strongly corrupted data and quantify uncertainty, despite the model only having been trained with uncorrupted data. A semi-supervised autoencoder trained on uncorrupted data is the underlying architecture. We use the decoding part as a generative model for realistic data and extend it by convolutions, masking, and additive Gaussian noise to describe imperfections. This constitutes a statistical inference task in terms of the optimal latent space activations of the underlying uncorrupted datum. We solve this problem approximately with Metric Gaussian Variational Inference (MGVI). The supervision of the autoencoder's latent space allows us to classify corrupted data directly under uncertainty with the statistically inferred latent space activations. Furthermore, we demonstrate that the model uncertainty strongly depends on whether the classification is correct or wrong, setting a basis for a statistical "lie detector" of the classification. Independent of that, we show that the generative model can optimally restore the uncorrupted datum by decoding the inferred latent space activations.

CVMay 5, 2021
Novelty Detection and Analysis of Traffic Scenario Infrastructures in the Latent Space of a Vision Transformer-Based Triplet Autoencoder

Jonas Wurst, Lakshman Balasubramanian, Michael Botsch et al.

Detecting unknown and untested scenarios is crucial for scenario-based testing. Scenario-based testing is considered to be a possible approach to validate autonomous vehicles. A traffic scenario consists of multiple components, with infrastructure being one of it. In this work, a method to detect novel traffic scenarios based on their infrastructure images is presented. An autoencoder triplet network provides latent representations for infrastructure images which are used for outlier detection. The triplet training of the network is based on the connectivity graphs of the infrastructure. By using the proposed architecture, expert-knowledge is used to shape the latent space such that it incorporates a pre-defined similarity in the neighborhood relationships of an autoencoder. An ablation study on the architecture is highlighting the importance of the triplet autoencoder combination. The best performing architecture is based on vision transformers, a convolution-free attention-based network. The presented method outperforms other state-of-the-art outlier detection approaches.

LGMay 27, 2020
An Entropy Based Outlier Score and its Application to Novelty Detection for Road Infrastructure Images

Jonas Wurst, Alberto Flores Fernández, Michael Botsch et al.

A novel unsupervised outlier score, which can be embedded into graph based dimensionality reduction techniques, is presented in this work. The score uses the directed nearest neighbor graphs of those techniques. Hence, the same measure of similarity that is used to project the data into lower dimensions, is also utilized to determine the outlier score. The outlier score is realized through a weighted normalized entropy of the similarities. This score is applied to road infrastructure images. The aim is to identify newly observed infrastructures given a pre-collected base dataset. Detecting unknown scenarios is a key for accelerated validation of autonomous vehicles. The results show the high potential of the proposed technique. To validate the generalization capabilities of the outlier score, it is additionally applied to various real world datasets. The overall average performance in identifying outliers using the proposed methods is higher compared to state-of-the-art methods. In order to generate the infrastructure images, an openDRIVE parsing and plotting tool for Matlab is developed as part of this work. This tool and the implementation of the entropy based outlier score in combination with Uniform Manifold Approximation and Projection are made publicly available.

SPOct 21, 2019
Model Order Selection in DoA Scenarios via Cross-Entropy based Machine Learning Techniques

Andreas Barthelme, Reinhard Wiesmayr, Wolfgang Utschick

In this paper, we present a machine learning approach for estimating the number of incident wavefronts in a direction of arrival scenario. In contrast to previous works, a multilayer neural network with a cross-entropy objective is trained. Furthermore, we investigate an online training procedure that allows an adaption of the neural network to imperfections of an antenna array without explicitly calibrating the array manifold. We show via simulations that the proposed method outperforms classical model order selection schemes based on information criteria in terms of accuracy, especially for a small number of snapshots and at low signal-to-noise-ratios. Also, the online training procedure enables the neural network to adapt with only a few online training samples, if initialized by offline training on artificial data.