Andreas Weinmann

h-index: 15

22 papers

128 citations

Novelty: 39%

AI Score: 46

Ranked #36,304 of 194,257 authors (top 19%); #12,917 in CV (top 22%)

22 Papers

3.9 · CV · Apr 19, 2023

Automatic Individual Identification of Patterned Solitary Species Based on Unlabeled Video Data

Vanessa Suessle, Mimi Arandjelovic, Ammie K. Kalan et al.

The manual processing and analysis of videos from camera traps is time-consuming and includes several steps, ranging from the filtering of falsely triggered footage to identifying and re-identifying individuals. In this study, we developed a pipeline to automatically analyze videos from camera traps to identify individuals without requiring manual interaction. This pipeline applies to animal species with uniquely identifiable fur patterns and solitary behavior, such as leopards (Panthera pardus). We assumed that the same individual was seen throughout one triggered video sequence. With this assumption, multiple images could be assigned to an individual for the initial database filling without pre-labeling. The pipeline was based on well-established components from computer vision and deep learning, particularly convolutional neural networks (CNNs) and scale-invariant feature transform (SIFT) features. We augmented this basis by implementing additional components to substitute otherwise required human interactions. Based on the similarity between frames from the video material, clusters were formed that represented individuals, bypassing the open-set problem of the unknown total population. The pipeline was tested on a dataset of leopard videos collected by the Pan African Programme: The Cultured Chimpanzee (PanAf) and achieved a success rate of over 83% for correct matches between previously unknown individuals. The proposed pipeline can become a valuable tool for future conservation projects based on camera trap data, reducing the work of manual analysis for individual identification when labeled data is unavailable.

1.2 · NA · May 25, 2016

Model-Based Reconstruction for Magnetic Particle Imaging in 2D and 3D

Thomas März, Andreas Weinmann

We contribute to the mathematical modeling and analysis of magnetic particle imaging (MPI), which is a promising new in-vivo imaging modality. Concerning modeling, we develop a structured decomposition of the imaging process and extract its core part which we reveal to be common to all previous contributions in this context. The central contribution of this paper is the development of reconstruction formulae for MPI in 2D and 3D. Until now, in the multivariate setup, only time-consuming measurement approaches are available, whereas reconstruction formulae are only available in 1D. The 2D and the 3D (describing the real world) reconstruction formulae which we derive here are significantly different from the 1D situation -- in particular there is no Dirac property in dimensions greater than one when the particle sizes approach zero. As a further result of our analysis, we conclude that the reconstruction problem in MPI is severely ill-posed. Finally, we obtain a model-based reconstruction algorithm.

5.2 · CV · Jul 6, 2024

BlessemFlood21: Advancing Flood Analysis with a High-Resolution Georeferenced Dataset for Humanitarian Aid Support

Vladyslav Polushko, Alexander Jenal, Jens Bongartz et al.

Floods are an increasingly common global threat, causing emergencies and severe damage to infrastructure. During crises, organisations such as the World Food Programme use remotely sensed imagery, typically obtained through drones, for rapid situational analysis to plan life-saving actions. Computer Vision tools are needed to support task force experts on-site in the evaluation of the imagery to improve their efficiency and to allocate resources strategically. We introduce the BlessemFlood21 dataset to stimulate research on efficient flood detection tools. The imagery was acquired during the 2021 Erftstadt-Blessem flooding event and consists of high-resolution and georeferenced RGB-NIR images. In the resulting RGB dataset, the images are supplemented with detailed water masks, obtained via a semi-supervised human-in-the-loop technique, where in particular the NIR information is leveraged to classify pixels as either water or non-water. We evaluate our dataset by training and testing established Deep Learning models for semantic segmentation. With BlessemFlood21 we provide labeled high-resolution RGB data and a baseline for further development of algorithmic solutions tailored to flood detection in RGB imagery.

5.6 · CV · Apr 13

Using Deep Learning Models Pretrained by Self-Supervised Learning for Protein Localization

Ben Isselmann, Dilara Göksu, Heinz Neumann et al.

Background: Task-specific microscopy datasets are often small, making it difficult to train deep learning models that learn robust features. While self-supervised learning (SSL) has shown promise through pretraining on large, domain-specific datasets, generalizability across datasets with differing staining protocols and channel configurations remains underexplored. We investigated the generalizability of SSL models pretrained on ImageNet-1k and HPA FOV, evaluating their embeddings on OpenCell with and without fine-tuning, two channel-mismatch strategies, and varying fine-tuning data fractions. We additionally analyzed single-cell embeddings on a labeled OpenCell subset. Result: DINO-based ViT backbones pretrained on HPA FOV or ImageNet-1k transfer well to OpenCell even without fine-tuning. The HPA FOV-pretrained model achieved the highest zero-shot performance (macro $F_1$ 0.822 $\pm$ 0.007). Fine-tuning further improved performance to 0.860 $\pm$ 0.013. At the single-cell level, the HPA single-cell-pretrained model achieved the highest k-nearest neighbor performance across all neighborhood sizes (macro $F_1$ $\geq$ 0.796). Conclusion: SSL methods like DINO, pretrained on large domain-relevant datasets, enable effective use of deep learning features for fine-tuning on small, task-specific microscopy datasets.

5.7 · NA · Mar 12

Debye Relaxation in Model-Based Multi-Dimensional Magnetic Particle Imaging

Vladyslav Gapyak, Thomas März, Andreas Weinmann

Model-based reconstruction approaches for the medical imaging modality Magnetic Particle Imaging (MPI) are typically based on the Langevin model, which assumes instantaneous alignment of the particles' magnetic moments with the applied field. Regarding the application to real data, Langevin model-based reconstruction methods require model transfer functions (MTF) obtained from calibrations to preprocess the data. There are also model-based reconstruction approaches that include relaxation effects and other particle-level dynamics. However, they are limited either to 1D or 1D-like scanning scenarios when considering real data, or are limited to simulated data in the case of multi-dimensional field-free point (FFP) MPI. Thus, fully model-based reconstructions from multi-dimensional FFP scanning data that incorporate relaxation effects without using an MTF have not yet been demonstrated. In this work, we incorporate relaxation effects by considering a multi-dimensional Debye model and provide reconstruction formulae. In particular, we show that the Debye model-based signal is the response of a linear time-invariant system with exponential memory applied to a Langevin model-based signal. We provide a reconstruction algorithm for the introduced multi-dimensional Debye model. To this end, we devise a relaxation adaption step. For the resulting relaxation-adapted Debye signal, we show that it can be expressed by the well-studied MPI core operator derived from the Langevin theory. This results in a three-stage algorithm with low additional cost over the Langevin model, as the relaxation adaption scales linearly in the input data. We provide numerical results for the proposed algorithmic approach. In particular, we obtain fully model-based reconstructions from real 2D MPI data without involving any specific MTF analogous to the Langevin model case.
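
The statement that the Debye signal is a Langevin signal passed through a linear time-invariant system with exponential memory can be illustrated with a small sketch. It assumes the standard first-order relaxation kernel (1/tau)·exp(-t/tau), which the abstract does not spell out. The function name `debye_relax`, the parameters `dt` and `tau`, and the input signal are hypothetical and not taken from the paper.

```python
import numpy as np

def debye_relax(signal, dt, tau):
    """Apply a causal exponential-memory filter to `signal`.

    Assumption (not from the abstract): the kernel is (1/tau) * exp(-t/tau).
    For piecewise-constant input this discretizes to the recursion
    y[n] = a * y[n-1] + (1 - a) * x[n], with a = exp(-dt / tau).
    """
    a = np.exp(-dt / tau)
    out = np.empty(len(signal), dtype=float)
    prev = 0.0  # system starts at rest
    for n, x in enumerate(signal):
        prev = a * prev + (1.0 - a) * x
        out[n] = prev
    return out

# Hypothetical stand-in for a Langevin-type signal: a plain sinusoid.
t = np.arange(0.0, 1.0, 1e-3)
langevin_like = np.sin(2 * np.pi * 5 * t)
relaxed = debye_relax(langevin_like, dt=1e-3, tau=0.02)
```

The loop runs once over the samples, so the cost of this step grows linearly with the length of the input, in line with the linear scaling the abstract attributes to the relaxation adaption.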

2.8 · CV · Jan 12

Vision-Language Model for Accurate Crater Detection

Patrick Bauer, Marius Schwinning, Florian Renk et al.

The European Space Agency (ESA), driven by its ambitions on planned lunar missions with the Argonaut lander, has a profound interest in reliable crater detection, since craters pose a risk to safe lunar landings. This task is usually addressed with automated crater detection algorithms (CDA) based on deep learning techniques. It is non-trivial due to the vast number of craters of various sizes and shapes, as well as challenging conditions such as varying illumination and rugged terrain. Therefore, we propose a deep-learning CDA based on the OWLv2 model, which is built on a Vision Transformer, that has proven highly effective in various computer vision tasks. For fine-tuning, we utilize a manually labeled dataset from the IMPACT project, which provides crater annotations on high-resolution Lunar Reconnaissance Orbiter Camera Calibrated Data Record images. We insert trainable parameters using a parameter-efficient fine-tuning strategy with Low-Rank Adaptation, and optimize a combined loss function consisting of Complete Intersection over Union (CIoU) for localization and a contrastive loss for classification. We achieve satisfactory visual results, along with a maximum recall of 94.0% and a maximum precision of 73.1% on a test dataset from IMPACT. Our method achieves reliable crater detection across challenging lunar imaging conditions, paving the way for robust crater analysis in future lunar exploration.

10.5 · CV · Nov 8, 2024

SynDroneVision: A Synthetic Dataset for Image-Based Drone Detection

Tamara R. Lenhard, Andreas Weinmann, Kai Franke et al.

Developing robust drone detection systems is often constrained by the limited availability of large-scale annotated training data and the high costs associated with real-world data collection. However, leveraging synthetic data generated via game engine-based simulations provides a promising and cost-effective solution to overcome this issue. Therefore, we present SynDroneVision, a synthetic dataset specifically designed for RGB-based drone detection in surveillance applications. Featuring diverse backgrounds, lighting conditions, and drone models, SynDroneVision offers a comprehensive training foundation for deep learning algorithms. To evaluate the dataset's effectiveness, we perform a comparative analysis across a selection of recent YOLO detection models. Our findings demonstrate that SynDroneVision is a valuable resource for real-world data enrichment, achieving notable enhancements in model performance and robustness, while significantly reducing the time and costs of real-world data acquisition. SynDroneVision will be publicly released upon paper acceptance.

8.5 · IV · Dec 30, 2023

An $\ell^1$-Plug-and-Play Approach for MPI Using a Zero Shot Denoiser with Evaluation on the 3D Open MPI Dataset

Vladyslav Gapyak, Corinna Rentschler, Thomas März et al.

Objective: Magnetic particle imaging (MPI) is an emerging medical imaging modality which has gained increasing interest in recent years. Among the benefits of MPI are its high temporal resolution, and that the technique does not expose the specimen to any kind of ionizing radiation. It is based on the non-linear response of magnetic nanoparticles to an applied magnetic field. From the electric signal measured in receive coils, the particle concentration has to be reconstructed. Due to the ill-posedness of the reconstruction problem, various regularization methods have been proposed for reconstruction ranging from early stopping methods, via classical Tikhonov regularization and iterative methods to modern machine learning approaches. In this work, we contribute to the latter class: we propose a plug-and-play approach based on a generic zero-shot denoiser with an $\ell^1$-prior. Approach: We validate the reconstruction parameters of the method on a hybrid dataset and compare it with the baseline Tikhonov, DIP and the previous PP-MPI, which is a plug-and-play method with a denoiser trained on MPI-friendly data. Main results: We offer a quantitative and qualitative evaluation of the zero-shot plug-and-play approach on the 3D Open MPI dataset. Moreover, we show the quality of the approach with different levels of preprocessing of the data. Significance: The proposed method employs a zero-shot denoiser which has not been trained for the MPI task and therefore saves the cost for training. Moreover, it offers a method that can be potentially applied in future MPI contexts.

6.2 · CV · May 28, 2025

Fast Trajectory-Independent Model-Based Reconstruction Algorithm for Multi-Dimensional Magnetic Particle Imaging

Vladyslav Gapyak, Thomas März, Andreas Weinmann

Magnetic Particle Imaging (MPI) is a promising tomographic technique for visualizing the spatio-temporal distribution of superparamagnetic nanoparticles, with applications ranging from cancer detection to real-time cardiovascular monitoring. Traditional MPI reconstruction relies on either time-consuming calibration (measured system matrix) or model-based simulation of the forward operator. Recent developments have shown the applicability of Chebyshev polynomials to multi-dimensional Lissajous Field-Free Point (FFP) scans. This method is bound to the particular choice of sinusoidal scanning trajectories. In this paper, we present the first reconstruction on real 2D MPI data with a trajectory-independent model-based MPI reconstruction algorithm. We further develop the zero-shot Plug-and-Play (PnP) algorithm of the authors -- with automatic noise level estimation -- to address the present deconvolution problem, leveraging a state-of-the-art denoiser trained on natural images without retraining on MPI-specific data. We evaluate our method on the publicly available 2D FFP MPI dataset "MPIdata: Equilibrium Model with Anisotropy", featuring scans of six phantoms acquired using a Bruker preclinical scanner. Moreover, we show reconstruction performed on custom data on a 2D scanner with additional high-frequency excitation field and partial data. Our results demonstrate strong reconstruction capabilities across different scanning scenarios -- setting a precedent for general-purpose, flexible model-based MPI reconstruction.

2.0 · CV · Apr 8, 2024

CNN-based Game State Detection for a Foosball Table

David Hagens, Jan M. Knaup, Elke Hergenröther et al.

The automation of games using Deep Reinforcement Learning (DRL) strategies is a well-known challenge in AI research. While for feature extraction in a video game typically the whole image is used, this is hardly practical for many real-world games. Instead, using a smaller game state that reduces the dimension of the parameter space to include essential parameters only seems to be a promising approach. In the game of Foosball, a compact and comprehensive game state description consists of the positional shifts and rotations of the figures and the position of the ball over time. In particular, velocities and accelerations can be derived from consecutive time samples of the game state. In this paper, a figure detection system to determine the game state in Foosball is presented. We capture a dataset in which the rotations of the rods were measured using accelerometers and the positional shifts were derived using traditional Computer Vision techniques (in a laboratory setting). This dataset is utilized to train Convolutional Neural Network (CNN) based end-to-end regression models to predict the rotations and shifts of each rod. We present an evaluation of our system using different state-of-the-art CNNs as base architectures for the regression model. We show that our system is able to predict the game state with high accuracy. By providing data for both black and white teams, the presented system is intended to provide the required data for future developments of Imitation Learning techniques with respect to observing human players.

6.3 · IV · Mar 6, 2024

Enhanced Low-Dose CT Image Reconstruction by Domain and Task Shifting Gaussian Denoisers

Tim Selig, Thomas März, Martin Storath et al.

Computed tomography from a low radiation dose (LDCT) is challenging due to high noise in the projection data. Popular approaches for LDCT image reconstruction are two-stage methods, typically consisting of the filtered backprojection (FBP) algorithm followed by a neural network for LDCT image enhancement. Two-stage methods are attractive for their simplicity and potential for computational efficiency, typically requiring only a single FBP and a neural network forward pass for inference. However, the best reconstruction quality is currently achieved by unrolled iterative methods (Learned Primal-Dual and ItNet), which are more complex and thus have a higher computational cost for training and inference. We propose a method combining the simplicity and efficiency of two-stage methods with state-of-the-art reconstruction quality. Our strategy utilizes a neural network pretrained for Gaussian noise removal from natural grayscale images, fine-tuned for LDCT image enhancement. We call this method FBP-DTSGD (Domain and Task Shifted Gaussian Denoisers) as the fine-tuning is a task shift from Gaussian denoising to enhancing LDCT images and a domain shift from natural grayscale to LDCT images. An ablation study with three different pretrained Gaussian denoisers indicates that the performance of FBP-DTSGD does not depend on a specific denoising architecture, suggesting future advancements in Gaussian denoising could benefit the method. The study also shows that pretraining on natural images enhances LDCT reconstruction quality, especially with limited training data. Notably, pretraining involves no additional cost, as existing pretrained models are used. The proposed method currently holds the top mean position in the LoDoPaB-CT challenge.

6.2 · CV · Sep 17, 2025

Performance Optimization of YOLO-FEDER FusionNet for Robust Drone Detection in Visually Complex Environments

Tamara R. Lenhard, Andreas Weinmann, Tobias Koch

Drone detection in visually complex environments remains challenging due to background clutter, small object scale, and camouflage effects. While generic object detectors like YOLO exhibit strong performance in low-texture scenes, their effectiveness degrades in cluttered environments with low object-background separability. To address these limitations, this work presents an enhanced iteration of YOLO-FEDER FusionNet -- a detection framework that integrates generic object detection with camouflage object detection techniques. Building upon the original architecture, the proposed iteration introduces systematic advancements in training data composition, feature fusion strategies, and backbone design. Specifically, the training process leverages large-scale, photo-realistic synthetic data, complemented by a small set of real-world samples, to enhance robustness under visually complex conditions. The contribution of intermediate multi-scale FEDER features is systematically evaluated, and detection performance is comprehensively benchmarked across multiple YOLO-based backbone configurations. Empirical results indicate that integrating intermediate FEDER features, in combination with backbone upgrades, contributes to notable performance improvements. In the most promising configuration -- YOLO-FEDER FusionNet with a YOLOv8l backbone and FEDER features derived from the DWD module -- these enhancements lead to an FNR reduction of up to 39.1 percentage points and a mAP increase of up to 62.8 percentage points at an IoU threshold of 0.5, compared to the initial baseline.

3.6 · CV · Apr 28, 2025

Remote Sensing Imagery for Flood Detection: Exploration of Augmentation Strategies

Vladyslav Polushko, Damjan Hatic, Ronald Rösch et al.

Floods cause serious problems around the world. Responding quickly and effectively requires accurate and timely information about the affected areas. The effective use of Remote Sensing images for accurate flood detection requires specific detection methods. Typically, Deep Neural Networks are employed, which are trained on specific datasets. For the purpose of river flood detection in RGB imagery, we use the BlessemFlood21 dataset. We here explore the use of different augmentation strategies, ranging from basic approaches to more complex techniques, including optical distortion. By identifying effective strategies, we aim to refine the training process of state-of-the-art Deep Learning segmentation networks.

2.0 · CV · Jun 19, 2024

CNN Based Flank Predictor for Quadruped Animal Species

Vanessa Suessle, Marco Heurich, Colleen T. Downs et al.

The bilateral asymmetry of the flanks of animals with visual body marks that uniquely identify an individual complicates tasks like population estimation. Automatically generated additional information on the visible side of the animal would improve the accuracy of individual identification. In this study, we used transfer learning on popular CNN image classification architectures to train a flank predictor that predicts the visible flank of quadruped mammalian species in images. We automatically derived the data labels from existing datasets originally labeled for animal pose estimation. We trained the models in two phases with different degrees of retraining. The developed models were evaluated in different scenarios of different unknown quadruped species in known and unknown environments. As a real-world scenario, we used a dataset of manually labeled Eurasian lynx (Lynx lynx) from camera traps in the Bavarian Forest National Park to evaluate the model. The best model, trained on an EfficientNetV2 backbone, achieved an accuracy of 88.70% for the unknown species lynx in a complex habitat.

4.9 · SD · Jun 19, 2024

Automated Bioacoustic Monitoring for South African Bird Species on Unlabeled Data

Michael Doell, Dominik Kuehn, Vanessa Suessle et al.

Analysis of passive acoustic monitoring (PAM) recordings for biodiversity monitoring is time-consuming and challenged by the presence of background noise in recordings. Existing models for sound event detection (SED) worked only on certain avian species and the development of further models required labeled data. The developed framework automatically extracted labeled data from available platforms for selected avian species. The labeled data were embedded into recordings, including environmental sounds and noise, and were used to train convolutional recurrent neural network (CRNN) models. The models were evaluated on unprocessed real-world data recorded in urban KwaZulu-Natal habitats. The Adapted SED-CRNN model reached an F1 score of 0.73, demonstrating its efficiency under noisy, real-world conditions. The proposed approach to automatically extract labeled data for chosen avian species enables an easy adaptation of PAM to other species and habitats for future conservation projects.

6.5 · CV · Jun 17, 2024

YOLO-FEDER FusionNet: A Novel Deep Learning Architecture for Drone Detection

Tamara R. Lenhard, Andreas Weinmann, Stefan Jäger et al.

Predominant methods for image-based drone detection frequently rely on employing generic object detection algorithms like YOLOv5. While proficient in identifying drones against homogeneous backgrounds, these algorithms often struggle in complex, highly textured environments. In such scenarios, drones seamlessly integrate into the background, creating camouflage effects that adversely affect the detection quality. To address this issue, we introduce a novel deep learning architecture called YOLO-FEDER FusionNet. Unlike conventional approaches, YOLO-FEDER FusionNet combines generic object detection methods with the specialized strength of camouflage object detection techniques to enhance drone detection capabilities. Comprehensive evaluations of YOLO-FEDER FusionNet show the efficiency of the proposed model and demonstrate substantial improvements in both reducing missed detections and false alarms.

1.2 · NA · Sep 12, 2020

Multi-Channel Potts-Based Reconstruction for Multi-Spectral Computed Tomography

Lukas Kiefer, Stefania Petra, Martin Storath et al.

We consider reconstructing multi-channel images from measurements performed by photon-counting and energy-discriminating detectors in the setting of multi-spectral X-ray computed tomography (CT). Our aim is to exploit the strong structural correlation that is known to exist between the channels of multi-spectral CT images. To that end, we adopt the multi-channel Potts prior to jointly reconstruct all channels. This prior produces piecewise constant solutions with strongly correlated channels. In particular, edges are enforced to have the same spatial position across channels which is a benefit over TV-based methods. We consider the Potts prior in two frameworks: (a) in the context of a variational Potts model, and (b) in a Potts-superiorization approach that perturbs the iterates of a basic iterative least squares solver. We identify an alternating direction method of multipliers (ADMM) approach as well as a Potts-superiorized conjugate gradient method as particularly suitable. In numerical experiments, we compare the Potts prior based approaches to existing TV-type approaches on realistically simulated multi-spectral CT data and obtain improved reconstruction for compound solid bodies.

1.2 · NA · Dec 3, 2018

Iterative Potts minimization for the recovery of signals with discontinuities from indirect measurements -- the multivariate case

Lukas Kiefer, Martin Storath, Andreas Weinmann

Signals and images with discontinuities appear in many problems in such diverse areas as biology, medicine, mechanics, and electrical engineering. The concrete data are often discrete, indirect and noisy measurements of some quantities describing the signal under consideration. A frequent task is to find the segments of the signal or image, which corresponds to finding the discontinuities or jumps in the data. Methods based on minimizing the piecewise constant Mumford-Shah functional -- whose discretized version is known as Potts functional -- are advantageous in this scenario, in particular, in connection with segmentation. However, due to their non-convexity, minimization of such functionals is challenging. In this paper we propose a new iterative minimization strategy for the multivariate Potts functional dealing with indirect, noisy measurements. We provide a convergence analysis and underpin our findings with numerical experiments.
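
For orientation, the Potts functional for indirect measurements is commonly written in the following form. The symbols $u$ (unknown signal or image), $A$ (measurement operator), $f$ (data) and $\gamma$ (jump penalty) are notation chosen here for illustration and are not taken from the abstract.

$$
P_\gamma(u) = \gamma \, \|\nabla u\|_0 + \|Au - f\|_2^2
$$

Here $\|\nabla u\|_0$ counts the jumps of $u$ and the second term measures the fit to the data. The jump-counting term is what makes the functional non-convex, which is the difficulty the abstract refers to.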

2.3 · NA · Aug 1, 2018

Wavelet Sparse Regularization for Manifold-Valued Data

Martin Storath, Andreas Weinmann

In this paper, we consider the sparse regularization of manifold-valued data with respect to an interpolatory wavelet/multiscale transform. We propose and study variational models for this task and provide results on their well-posedness. We present algorithms for a numerical realization of these models in the manifold setup. Further, we provide experimental results to show the potential of the proposed schemes for applications.

4.3 · NA · Apr 27, 2018

Variational Regularization of Inverse Problems for Manifold-Valued Data

Martin Storath, Andreas Weinmann

In this paper, we consider the variational regularization of manifold-valued data in the inverse problems setting. In particular, we consider TV and TGV regularization for manifold-valued data with indirect measurement operators. We provide results on the well-posedness and present algorithms for a numerical realization of these models in the manifold setup. Further, we provide experimental results for synthetic and real data to show the potential of the proposed schemes for applications.

1.7 · CV · Feb 6, 2018

Fast Piecewise-Affine Motion Estimation Without Segmentation

Denis Fortun, Martin Storath, Dennis Rickert et al.

Current algorithmic approaches for piecewise affine motion estimation are based on alternating motion segmentation and estimation. We propose a new method to estimate piecewise affine motion fields directly without intermediate segmentation. To this end, we reformulate the problem by imposing piecewise constancy of the parameter field, and derive a specific proximal splitting optimization scheme. A key component of our framework is an efficient one-dimensional piecewise-affine estimator for vector-valued signals. The first advantage of our approach over segmentation-based methods is its absence of initialization. The second advantage is its lower computational cost which is independent of the complexity of the motion field. In addition to these features, we demonstrate competitive accuracy with other piecewise-parametric methods on standard evaluation benchmarks. Our new regularization scheme also outperforms the more standard use of total variation and total generalized variation.

5.1 · NA · Oct 7, 2014

Mumford-Shah and Potts Regularization for Manifold-Valued Data with Applications to DTI and Q-Ball Imaging

Andreas Weinmann, Laurent Demaret, Martin Storath

Mumford-Shah and Potts functionals are powerful variational models for regularization which are widely used in signal and image processing; typical applications are edge-preserving denoising and segmentation. Being both non-smooth and non-convex, they are computationally challenging even for scalar data. For manifold-valued data, the problem becomes even more involved since typical features of vector spaces are not available. In this paper, we propose algorithms for Mumford-Shah and for Potts regularization of manifold-valued signals and images. For the univariate problems, we derive solvers based on dynamic programming combined with (convex) optimization techniques for manifold-valued data. For the class of Cartan-Hadamard manifolds (which includes the data space in diffusion tensor imaging), we show that our algorithms compute global minimizers for any starting point. For the multivariate Mumford-Shah and Potts problems (for image regularization) we propose a splitting into suitable subproblems which we can solve exactly using the techniques developed for the corresponding univariate problems. Our method does not require any a priori restrictions on the edge set and we do not have to discretize the data space. We apply our method to diffusion tensor imaging (DTI) as well as Q-ball imaging. Using the DTI model, we obtain a segmentation of the corpus callosum.
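
The dynamic-programming idea behind the univariate solvers can be sketched for the simplest case. This is a minimal illustration for real-valued (scalar) data with direct measurements, not the manifold-valued algorithm of the paper. The function name `potts_1d` and the parameter `gamma` are hypothetical. Handling manifold-valued data would presumably require manifold counterparts of the segment mean and the squared deviation, which this sketch does not attempt.

```python
import numpy as np

def potts_1d(f, gamma):
    """Minimize gamma * (number of jumps of u) + sum((u - f)**2) over
    piecewise constant u, for scalar data f, by dynamic programming."""
    f = np.asarray(f, dtype=float)
    n = len(f)
    # Prefix sums give the squared deviation of any segment in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(f)))
    s2 = np.concatenate(([0.0], np.cumsum(f * f)))

    def seg_cost(l, r):
        # Squared deviation of f[l:r] from its mean (r exclusive).
        s = s1[r] - s1[l]
        return (s2[r] - s2[l]) - s * s / (r - l)

    best = np.full(n + 1, np.inf)  # best[r]: optimal value on f[:r]
    best[0] = -gamma               # the first segment is not a jump
    last = np.zeros(n + 1, dtype=int)
    for r in range(1, n + 1):
        for l in range(r):
            c = best[l] + gamma + seg_cost(l, r)
            if c < best[r]:
                best[r] = c
                last[r] = l
    # Backtrack the optimal segmentation and fill in segment means.
    u = np.empty(n)
    r = n
    while r > 0:
        l = last[r]
        u[l:r] = (s1[r] - s1[l]) / (r - l)
        r = l
    return u

u_small = potts_1d([0, 0, 0, 5, 5, 5], gamma=1.0)    # keeps the jump
u_large = potts_1d([0, 0, 0, 5, 5, 5], gamma=100.0)  # one constant segment
```

A small `gamma` preserves the jump in the data, while a large `gamma` makes a single constant segment cheaper. The double loop costs O(n^2) segment evaluations.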