Karsten Kahl

CV
h-index: 21
Papers: 18
Citations: 172
Novelty: 48%
AI Score: 39

18 Papers

CV · Mar 13, 2023
Identifying Label Errors in Object Detection Datasets by Loss Inspection

Marius Schubert, Tobias Riedlinger, Karsten Kahl et al.

Labeling datasets for supervised object detection is a dull and time-consuming task. Errors can be easily introduced during annotation and overlooked during review, yielding inaccurate benchmarks and performance degradation of deep neural networks trained on noisy labels. In this work, we introduce, for the first time, a benchmark for label error detection methods on object detection datasets as well as a label error detection method and a number of baselines. We simulate four different types of randomly introduced label errors on train and test sets of well-labeled object detection datasets. For our label error detection method we assume a two-stage object detector to be given and consider the sum of both stages' classification and regression losses. The losses are computed with respect to the predictions and the noisy labels including simulated label errors, aiming at detecting the latter. We compare our method to three baselines: a naive one without deep learning, the object detector's score and the entropy of the classification softmax distribution. We outperform all baselines and demonstrate that among the considered methods, ours is the only one that detects label errors of all four types efficiently. Furthermore, we detect real label errors a) on commonly used test datasets in object detection and b) on a proprietary dataset. In both cases we achieve low false positive rates, i.e., we detect label errors with a precision of up to 71.5% for a) and of 97% for b).

NA · Mar 29, 2017
Optimal interpolation and Compatible Relaxation in Classical Algebraic Multigrid

James Brannick, Fei Cao, Karsten Kahl et al.

In this paper, we consider a classical form of optimal algebraic multigrid (AMG) interpolation that directly minimizes the two-grid convergence rate and compare it with the so-called ideal form that minimizes a certain weak approximation property of the coarse space. We study compatible relaxation type estimates for the quality of the coarse grid and derive a new sharp measure using optimal interpolation that provides a guaranteed lower bound on the convergence rate of the resulting two-grid method for a given grid. In addition, we design a generalized bootstrap algebraic multigrid setup algorithm that computes a sparse approximation to the optimal interpolation matrix. We demonstrate numerically that the BAMG method with sparse interpolation matrix (and spanning multiple levels) outperforms the two-grid method with the standard ideal interpolation (a dense matrix) for various scalar diffusion problems with highly varying diffusion coefficient.

NA · Jun 29, 2011
An algebraic distances measure of AMG strength of connection

Achi Brandt, James Brannick, Karsten Kahl et al.

Algebraic multigrid is an iterative method that is often optimal for solving the matrix equations that arise in a wide variety of applications, including discretized partial differential equations. It automatically constructs a sequence of increasingly smaller matrix problems that enable efficient resolution of all scales present in the solution. One of the main components of the method is an adequate choice of coarse grids. The current coarsening methodology is based on measuring how a so-called algebraically smooth error value at one point depends on the error values at its neighbors. Such a concept of strength of connection is well understood for operators whose principal part is an M-matrix; however, the strength concept for more general matrices is not yet clearly understood, and this lack of knowledge limits the scope of AMG applicability. The purpose of this paper is to motivate a general definition of strength of connection, based on the notion of algebraic distances, discuss its implementation, and present the results of initial numerical experiments. The algebraic distance measure we propose uses as its main tool a least squares functional, which is also applied to define interpolation.

STR-EL · Nov 19, 2018
Schur complement solver for Quantum Monte-Carlo simulations of strongly interacting fermions

Maksim Ulybyshev, Nils Kintscher, Karsten Kahl et al.

We present a non-iterative solver based on the Schur complement method for sparse linear systems of special form which appear in Quantum Monte-Carlo (QMC) simulations of strongly interacting fermions on the lattice. While the number of floating-point operations for this solver scales as the cube of the number of lattice sites, for practically relevant lattice sizes it is still significantly faster than iterative solvers such as the Conjugate Gradient method in the regime of strong inter-fermion interactions, for example, in the vicinity of quantum phase transitions. The speed-up is even more dramatic for the solution of multiple linear systems with different right-hand sides. We present benchmark results for QMC simulations of the tight-binding models on the hexagonal graphene lattice with on-site (Hubbard) and non-local (Coulomb) interactions, and demonstrate the potential for further speed-up using GPU.
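As a rough illustration of the Schur complement idea mentioned in this abstract, the following sketch solves a generic 2x2 block system with dense NumPy routines. It is not the solver from the paper: the block names `A`, `B`, `C`, `D`, the right-hand sides `f`, `g`, and the function `schur_solve` are hypothetical, and the special sparse structure of the QMC systems is not exploited here.

```python
import numpy as np

def schur_solve(A, B, C, D, f, g):
    """Solve [[A, B], [C, D]] @ [x, y] = [f, g] via the Schur complement of A.

    Generic dense sketch; assumes A and S = D - C A^{-1} B are nonsingular.
    """
    A_inv_B = np.linalg.solve(A, B)          # A^{-1} B
    A_inv_f = np.linalg.solve(A, f)          # A^{-1} f
    S = D - C @ A_inv_B                      # Schur complement of A
    y = np.linalg.solve(S, g - C @ A_inv_f)  # reduced system for y
    x = A_inv_f - A_inv_B @ y                # back-substitution for x
    return x, y
```

The quantities `A_inv_B` and `S` do not depend on the right-hand side, so in a setting with many right-hand sides they could be computed once and reused.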

NA · Feb 2, 2018
Least Angle Regression Coarsening in Bootstrap Algebraic Multigrid

Karsten Kahl, Matthias Rottmann

The bootstrap algebraic multigrid framework allows for the adaptive construction of algebraic multigrid methods in situations where geometric multigrid methods are not known or not available at all. While there has been some work on adaptive coarsening in this framework in terms of algebraic distances, coarsening is the part of the adaptive bootstrap setup that is least developed. In this paper we try to close this gap by introducing an adaptive coarsening scheme that views interpolation as a local regression problem. In fact the bootstrap algebraic multigrid setup can be understood as a machine learning ansatz that learns the nature of smooth error by local regression. In order to turn this idea into a practical method we modify least squares interpolation to both avoid overfitting of the data and to recover a sparse response that can be used to extract information about the coupling strength amongst variables like in classical algebraic multigrid. In order to improve the so-found coarse grid we propose a post-processing to ensure stability of the resulting least squares interpolation operator. We conclude with numerical experiments that show the viability of the chosen approach.

CV · Sep 30, 2023
Deep Active Learning with Noisy Oracle in Object Detection

Marius Schubert, Tobias Riedlinger, Karsten Kahl et al.

Obtaining annotations for complex computer vision tasks such as object detection is an expensive and time-intense endeavor involving a large number of human workers or expert opinions. Reducing the amount of annotations required while maintaining algorithm performance is, therefore, desirable for machine learning practitioners and has been successfully achieved by active learning algorithms. However, it is not merely the amount of annotations which influences model performance but also the annotation quality. In practice, the oracles that are queried for new annotations frequently contain significant amounts of noise. Therefore, cleansing procedures are oftentimes necessary to review and correct given labels. This process is subject to the same budget as the initial annotation itself since it requires human workers or even domain experts. Here, we propose a composite active learning framework including a label review module for deep object detection. We show that utilizing part of the annotation budget to correct the noisy annotations partially in the active dataset leads to early improvements in model performance, especially when coupled with uncertainty-based query strategies. The precision of the label error proposals has a significant influence on the measured effect of the label review. In our experiments we achieve improvements of up to 4.5 mAP points of object detection performance by incorporating label reviews at equal annotation budget.

CV · Jun 13, 2023
LMD: Light-weight Prediction Quality Estimation for Object Detection in Lidar Point Clouds

Tobias Riedlinger, Marius Schubert, Sarina Penquitt et al.

Object detection on Lidar point cloud data is a promising technology for autonomous driving and robotics which has seen a significant rise in performance and accuracy during recent years. Particularly uncertainty estimation is a crucial component for down-stream tasks and deep neural networks remain error-prone even for predictions with high confidence. Previously proposed methods for quantifying prediction uncertainty tend to alter the training scheme of the detector or rely on prediction sampling which results in vastly increased inference time. In order to address these two issues, we propose LidarMetaDetect (LMD), a light-weight post-processing scheme for prediction quality estimation. Our method can easily be added to any pre-trained Lidar object detector without altering anything about the base model and is purely based on post-processing, therefore, only leading to a negligible computational overhead. Our experiments show a significant increase of statistical reliability in separating true from false predictions. We propose and evaluate an additional application of our method leading to the detection of annotation errors. Explicit samples and a conservative count of annotation error proposals indicate the viability of our method for large-scale datasets like KITTI and nuScenes. On the widely-used nuScenes test dataset, 43 out of the top 100 proposals of our method indicate, in fact, erroneous annotations.

CV · Dec 21, 2022
Towards Rapid Prototyping and Comparability in Active Learning for Deep Object Detection

Tobias Riedlinger, Marius Schubert, Karsten Kahl et al.

Active learning as a paradigm in deep learning is especially important in applications involving intricate perception tasks such as object detection where labels are difficult and expensive to acquire. Development of active learning methods in such fields is highly computationally expensive and time consuming which obstructs the progression of research and leads to a lack of comparability between methods. In this work, we propose and investigate a sandbox setup for rapid development and transparent evaluation of active learning in deep object detection. Our experiments with commonly used configurations of datasets and detection architectures found in the literature show that results obtained in our sandbox environment are representative of results on standard configurations. The total compute time to obtain results and assess the learning behavior can thereby be reduced by factors of up to 14 when comparing with Pascal VOC and up to 32 when comparing with BDD100k. This allows for testing and evaluating data acquisition and labeling strategies in under half a day and contributes to the transparency and development speed in the field of active learning for object detection.

CV · Nov 10, 2022
MGiaD: Multigrid in all dimensions. Efficiency and robustness by coarsening in resolution and channel dimensions

Antonia van Betteray, Matthias Rottmann, Karsten Kahl

Current state-of-the-art deep neural networks for image classification are made up of 10 - 100 million learnable weights and are therefore inherently prone to overfitting. The complexity of the weight count can be seen as a function of the number of channels, the spatial extent of the input and the number of layers of the network. Due to the use of convolutional layers the scaling of weight complexity is usually linear with regard to the resolution dimensions, but remains quadratic with respect to the number of channels. Active research in recent years on using multigrid inspired ideas in deep neural networks has shown that on the one hand a significant number of weights can be saved by appropriate weight sharing and on the other that a hierarchical structure in the channel dimension can improve the weight complexity to linear. In this work, we combine these multigrid ideas to introduce a joint framework of multigrid inspired architectures that exploit multigrid structures in all relevant dimensions to achieve linear weight complexity scaling and drastically reduced weight counts. Our experiments show that this structured reduction in weight count is able to reduce overfitting and thus shows improved performance over state-of-the-art ResNet architectures on typical image classification benchmarks at lower network complexity.

NA · May 20, 2016
Multigrid methods combined with low-rank approximation for tensor structured Markov chains

Matthias Bolten, Karsten Kahl, Daniel Kressner et al.

Markov chains that describe interacting subsystems suffer, on the one hand, from state space explosion but lead, on the other hand, to highly structured matrices. In this work, we propose a novel tensor-based algorithm to address such tensor structured Markov chains. Our algorithm combines a tensorized multigrid method with AMEn, an optimization-based low-rank tensor solver, for addressing coarse grid problems. Numerical experiments demonstrate that this combination overcomes the limitations incurred when using each of the two methods individually. As a consequence, Markov chain models of unprecedented size from a variety of applications can be addressed.

NA · Mar 27
A Theory of Relaxation-Based Algebraic Multigrid

Rayan Moussa, Karsten Kahl

Algebraic multigrid (AMG) methods derive their optimal efficiency from the interplay between a relaxation process and a corresponding coarse grid correction. In many standard formulations, relaxation and coarse-graining are analyzed and treated as largely separate from one another. Here we propose an alternative theoretical approach centered entirely on the relaxation process, which exposes its fundamental role in the coarse-graining of the fine-scale problem. By treating the relaxation of the error as a dynamical system and applying a dimensional-reduction procedure analogous to the Mori-Zwanzig-Nakajima formalism, we derive exact expressions for the coarse-level equations and the interpolation operations, as well as a natural way of computing complementary transfer operators. We illustrate the unifying nature of this framework by recovering several well-known results for general non-symmetric systems, including ideal and optimal restriction and interpolation, as well as the limiting case of exact elimination. We further emphasize the pivotal importance of compatible-relaxation and identify dynamical corrections that naturally arise in our theory, which have the potential to enhance the convergence, robustness, and adaptivity of future algebraic multigrid methods.

CV · Feb 14, 2024
Reducing Texture Bias of Deep Neural Networks via Edge Enhancing Diffusion

Edgar Heinert, Matthias Rottmann, Kira Maag et al.

Convolutional neural networks (CNNs) for image processing tend to focus on localized texture patterns, commonly referred to as texture bias. While most of the previous works in the literature focus on the task of image classification, we go beyond this and study the texture bias of CNNs in semantic segmentation. In this work, we propose to train CNNs on pre-processed images with less texture to reduce the texture bias. Therein, the challenge is to suppress image texture while preserving shape information. To this end, we utilize edge enhancing diffusion (EED), an anisotropic image diffusion method initially introduced for image compression, to create texture reduced duplicates of existing datasets. Extensive numerical studies are performed with both CNNs and vision transformer models trained on original data and EED-processed data from the Cityscapes dataset and the CARLA driving simulator. We observe strong texture-dependence of CNNs and moderate texture-dependence of transformers. Training CNNs on EED-processed images enables the models to become completely ignorant with respect to texture, demonstrating resilience with respect to texture re-introduction to any degree. Additionally we analyze the performance reduction in depth on a level of connected components in the semantic segmentation and study the influence of EED pre-processing on domain generalization as well as adversarial robustness.

LG · Mar 13, 2025
Poly-MgNet: Polynomial Building Blocks in Multigrid-Inspired ResNets

Antonia van Betteray, Matthias Rottmann, Karsten Kahl

The structural analogies of ResNets and Multigrid (MG) methods such as common building blocks like convolutions and poolings were already pointed out by He et al. in 2016. Multigrid methods are used in the context of scientific computing for solving large sparse linear systems arising from partial differential equations. MG methods particularly rely on two main concepts: smoothing and residual restriction / coarsening. Exploiting these analogies, He and Xu developed the MgNet framework, which integrates MG schemes into the design of ResNets. In this work, we introduce a novel neural network building block inspired by polynomial smoothers from MG theory. Our polynomial block from an MG perspective naturally extends the MgNet framework to Poly-MgNet and at the same time reduces the number of weights in MgNet. We present a comprehensive study of our polynomial block, analyzing the choice of initial coefficients, the polynomial degree, the placement of activation functions, as well as of batch normalizations. Our results demonstrate that constructing (quadratic) polynomial building blocks based on real and imaginary polynomial roots enhances Poly-MgNet's capacity in terms of accuracy. Furthermore, our approach achieves an improved trade-off of model accuracy and number of weights compared to ResNet as well as compared to specific configurations of MgNet.

LG · Jun 5, 2025
LFA applied to CNNs: Efficient Singular Value Decomposition of Convolutional Mappings by Local Fourier Analysis

Antonia van Betteray, Matthias Rottmann, Karsten Kahl

The singular values of convolutional mappings encode interesting spectral properties, which can be used, e.g., to improve generalization and robustness of convolutional neural networks as well as to facilitate model compression. However, the computation of singular values is typically very resource-intensive. The naive approach involves unrolling the convolutional mapping along the input and channel dimensions into a large and sparse two-dimensional matrix, making the exact calculation of all singular values infeasible due to hardware limitations. In particular, this is true for matrices that represent convolutional mappings with large inputs and a high number of channels. Existing efficient methods leverage the fast Fourier transform (FFT) to transform convolutional mappings into the frequency domain, enabling the computation of singular values for matrices representing convolutions with larger input and channel dimensions. For a constant number of channels in a given convolution, an FFT can compute N singular values in O(N log N) complexity. In this work, we propose an approach of complexity O(N) based on local Fourier analysis, which additionally exploits the shift invariance of convolutional operators. We provide a theoretical analysis of our algorithm's runtime and validate its efficiency through numerical experiments. Our results demonstrate that our proposed method is scalable and offers a practical solution to calculate the entire set of singular values, along with the corresponding singular vectors if needed, for high-dimensional convolutional mappings.
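For orientation, the FFT-based baseline that this abstract refers to can be sketched in a few lines. The sketch assumes periodic boundary conditions and a kernel laid out as `(k, k, c_in, c_out)`; the function name `conv_singular_values` and this layout are illustrative assumptions. It shows the O(N log N) FFT approach only, not the O(N) local Fourier analysis method proposed in the paper.

```python
import numpy as np

def conv_singular_values(kernel, n):
    """Singular values of a periodic 2D convolution acting on n x n inputs.

    kernel: array of shape (k, k, c_in, c_out) (assumed layout).
    Returns an array of shape (n, n, min(c_in, c_out)).
    """
    # One small c_in x c_out matrix per frequency pair.
    symbols = np.fft.fft2(kernel, s=(n, n), axes=(0, 1))
    # Batched SVD over the trailing two axes; only singular values are kept.
    return np.linalg.svd(symbols, compute_uv=False)
```

For a single input and output channel, the result reduces to the absolute values of the kernel's 2D discrete Fourier transform.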

CV · Oct 4, 2020
MetaDetect: Uncertainty Quantification and Prediction Quality Estimates for Object Detection

Marius Schubert, Karsten Kahl, Matthias Rottmann

In object detection with deep neural networks, the box-wise objectness score tends to be overconfident, sometimes even indicating high confidence in presence of inaccurate predictions. Hence, the reliability of the prediction and therefore reliable uncertainties are of highest interest. In this work, we present a post processing method that for any given neural network provides predictive uncertainty estimates and quality estimates. These estimates are learned by a post processing model that receives as input a hand-crafted set of transparent metrics in form of a structured dataset. Therefrom, we learn two tasks for predicted bounding boxes. We discriminate between true positives ($\mathit{IoU}\geq0.5$) and false positives ($\mathit{IoU} < 0.5$) which we term meta classification, and we predict $\mathit{IoU}$ values directly which we term meta regression. The probabilities of the meta classification model aim at learning the probabilities of success and failure and therefore provide a modelled predictive uncertainty estimate. On the other hand, meta regression gives rise to a quality estimate. In numerical experiments, we use the publicly available YOLOv3 network and the Faster-RCNN network and evaluate meta classification and regression performance on the KITTI, Pascal VOC and COCO datasets. We demonstrate that our metrics are indeed well correlated with the $\mathit{IoU}$. For meta classification we obtain classification accuracies of up to 98.92% and AUROCs of up to 99.93%. For meta regression we obtain an $R^2$ value of up to 91.78%. These results yield significant improvements compared to the network's objectness score and other baseline approaches. Therefore, we obtain more reliable uncertainty and quality estimates which is particularly interesting in the absence of ground truth.
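The true positive / false positive split at an IoU of 0.5 described above can be illustrated with a minimal sketch. The box format `(x1, y1, x2, y2)`, the function names `iou` and `meta_labels`, and the class-agnostic matching against the best-overlapping ground-truth box are assumptions made for illustration; the paper's actual matching protocol may differ.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def meta_labels(pred_boxes, gt_boxes, threshold=0.5):
    """Per predicted box: (best IoU with any ground truth, is_true_positive).

    The best IoU is a meta regression target; the boolean is a
    meta classification target.
    """
    labels = []
    for pred in pred_boxes:
        best = max((iou(pred, gt) for gt in gt_boxes), default=0.0)
        labels.append((best, best >= threshold))
    return labels
```

A post-processing model would then be fitted on per-box metrics to predict these two targets.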

LG · Mar 3, 2018
Deep Bayesian Active Semi-Supervised Learning

Matthias Rottmann, Karsten Kahl, Hanno Gottschalk

In many applications the process of generating label information is expensive and time consuming. We present a new method that combines active and semi-supervised deep learning to achieve high generalization performance from a deep convolutional neural network with as few known labels as possible. In a setting where a small amount of labeled data as well as a large amount of unlabeled data is available, our method first learns the labeled data set. This initialization is followed by an expectation maximization algorithm, where further training reduces classification entropy on the unlabeled data by targeting a low entropy fit which is consistent with the labeled data. In addition the algorithm asks at a specified frequency an oracle for labels of data with entropy above a certain entropy quantile. Using this active learning component we obtain an agile labeling process that achieves high accuracy, but requires only a small amount of known labels. For the MNIST dataset we report an error rate of 2.06% using only 300 labels and 1.06% for 1000 labels. These results are obtained without employing any special network architecture or data augmentation.
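The active component described above, querying labels for samples whose classification entropy exceeds an entropy quantile, can be sketched as follows. The function names, the default quantile of 0.9, and the use of a strict inequality are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def softmax_entropy(probs, eps=1e-12):
    """Shannon entropy of each row of class probabilities."""
    probs = np.asarray(probs, dtype=float)
    return -np.sum(probs * np.log(probs + eps), axis=-1)

def query_indices(probs, quantile=0.9):
    """Indices of unlabeled samples with entropy above the given quantile."""
    entropy = softmax_entropy(probs)
    cutoff = np.quantile(entropy, quantile)
    return np.flatnonzero(entropy > cutoff)
```

The returned indices would be passed to the oracle, and the newly labeled samples moved into the labeled set before training continues.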

NA · May 7, 2015
Multigrid methods for tensor structured Markov chains with low rank approximation

Matthias Bolten, Karsten Kahl, Sonja Sokolović

Tensor structured Markov chains are part of stochastic models of many practical applications, e.g., in the description of complex production or telephone networks. The most interesting question in Markov chain models is the determination of the stationary distribution as a description of the long term behavior of the system. This involves the computation of the eigenvector corresponding to the dominant eigenvalue or equivalently the solution of a singular linear system of equations. Due to the tensor structure of the models the dimension of the operators grows rapidly and a direct solution without exploiting the tensor structure becomes infeasible. Algebraic multigrid methods have proven to be efficient when dealing with Markov chains without using tensor structure. In this work we present an approach to adapt the algebraic multigrid framework to the tensor frame, not only using the tensor structure in matrix-vector multiplications, but also tensor structured coarse-grid operators and tensor representations of the solution vector.
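To make the target quantity concrete, the stationary distribution of a small Markov chain can be computed by plain power iteration, as in the sketch below. This is a dense toy example under the assumption of a row-stochastic, irreducible and aperiodic transition matrix `P`; the function name is hypothetical, and neither multigrid nor tensor structure is used here.

```python
import numpy as np

def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Power iteration for pi = pi @ P with a row-stochastic matrix P."""
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)  # uniform starting distribution
    for _ in range(max_iter):
        nxt = pi @ P
        if np.linalg.norm(nxt - pi, 1) < tol:
            return nxt
        pi = nxt
    return pi
```

For tensor structured chains the state space is far too large for such a dense approach, which is the motivation for the structured methods described in the abstract.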

HEP-LAT · Oct 27, 2014
Multigrid Preconditioning for the Overlap Operator in Lattice QCD

James Brannick, Andreas Frommer, Karsten Kahl et al.

The overlap operator is a lattice discretization of the Dirac operator of quantum chromodynamics, the fundamental physical theory of the strong interaction between the quarks. As opposed to other discretizations it preserves the important physical property of chiral symmetry, at the expense of requiring much more effort when solving systems with this operator. We present a preconditioning technique based on another lattice discretization, the Wilson-Dirac operator. The mathematical analysis precisely describes the effect of this preconditioning in the case that the Wilson-Dirac operator is normal. Although this is not exactly the case in realistic settings, we show that current smearing techniques indeed drive the Wilson-Dirac operator towards normality, thus providing a motivation why our preconditioner works well in computational practice. Results of numerical experiments in physically relevant settings show that our preconditioning yields accelerations of up to one order of magnitude.