Marco F. Huber

LG
h-index: 11
47 papers
1,636 citations
Novelty: 45%
AI Score: 55

47 Papers

LG | Jun 21, 2023
Automated Machine Learning for Remaining Useful Life Predictions

Marc-André Zöller, Fabian Mauthe, Peter Zeiler et al.

Being able to predict the remaining useful life (RUL) of an engineering system is an important task in prognostics and health management. Recently, data-driven approaches to RUL predictions have become prevalent over model-based approaches since no underlying physical knowledge of the engineering system is required. Yet, this just replaces required expertise of the underlying physics with machine learning (ML) expertise, which is often also not available. Automated machine learning (AutoML) promises to build end-to-end ML pipelines automatically, enabling domain experts without ML expertise to create their own models. This paper introduces AutoRUL, an AutoML-driven end-to-end approach for automatic RUL predictions. AutoRUL combines fine-tuned standard regression methods into an ensemble with high predictive power. By evaluating the proposed method on eight real-world and synthetic datasets against state-of-the-art hand-crafted models, we show that AutoML provides a viable alternative to hand-crafted data-driven RUL predictions. Consequently, creating RUL predictions can be made more accessible for domain experts using AutoML by eliminating ML expertise from data-driven model construction.

SY | Mar 30, 2012
Adaptive Gaussian Mixture Filter Based on Statistical Linearization

Marco F. Huber

Gaussian mixtures are a common density representation in nonlinear, non-Gaussian Bayesian state estimation. Selecting an appropriate number of Gaussian components, however, is difficult as one has to trade off computational complexity against estimation accuracy. In this paper, an adaptive Gaussian mixture filter based on statistical linearization is proposed. Depending on the nonlinearity of the considered estimation problem, this filter dynamically increases the number of components via splitting. For this purpose, a measure is introduced that allows for quantifying the locally induced linearization error at each Gaussian mixture component. The deviation between the nonlinear and the linearized state space model is evaluated for determining the splitting direction. The proposed approach is not restricted to a specific statistical linearization method. Simulations show the superior estimation performance compared to related approaches and common filtering algorithms.
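
To illustrate what "splitting" a mixture component can mean, here is a minimal one-dimensional sketch. It is not the splitting rule of the paper: the function names, the displacement parameter `a`, and the symmetric two-way split are assumptions chosen for illustration. The split replaces one Gaussian with two narrower ones while preserving the overall mean and variance.

```python
import math


def split_gaussian(weight, mean, var, a=0.5):
    """Split one 1D Gaussian component into two narrower ones.

    Hypothetical helper: `a` in (0, 1) controls how far the two new
    means move apart (in units of the standard deviation). The split
    keeps the total weight, mean, and variance unchanged.
    """
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie strictly between 0 and 1")
    shift = a * math.sqrt(var)
    new_var = var * (1.0 - a * a)
    return [
        (weight / 2.0, mean - shift, new_var),
        (weight / 2.0, mean + shift, new_var),
    ]


def mixture_moments(components):
    """Return (mean, variance) of a 1D Gaussian mixture."""
    total = sum(w for w, _, _ in components)
    mean = sum(w * m for w, m, _ in components) / total
    second = sum(w * (v + m * m) for w, m, v in components) / total
    return mean, second - mean * mean


# Splitting N(2, 4) gives two components whose mixture has the same moments.
parts = split_gaussian(1.0, 2.0, 4.0, a=0.5)
split_mean, split_var = mixture_moments(parts)
```

In a filter of the kind described above, a split like this would be applied only to components whose linearization error is large, and along the direction in which the nonlinear and linearized models deviate most.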

CV | Feb 16, 2023
Defect Transfer GAN: Diverse Defect Synthesis for Data Augmentation

Ruyu Wang, Sabrina Hoppe, Eduardo Monari et al.

Data-hunger and data-imbalance are two major pitfalls in many deep learning approaches. For example, on highly optimized production lines, defective samples are hard to acquire while non-defective samples come almost for free. The defects, however, often seem to resemble each other, e.g., scratches on different products may only differ in a few characteristics. In this work, we introduce a framework, Defect Transfer GAN (DT-GAN), which learns to represent defect types independent of and across various background products and yet can apply defect-specific styles to generate realistic defective images. An empirical study on the MVTec AD and two additional datasets shows that DT-GAN outperforms state-of-the-art image synthesis methods w.r.t. sample fidelity and diversity in defect generation. We further demonstrate benefits for a critical downstream task in manufacturing -- defect classification. Results show that the augmented data from DT-GAN provides consistent gains even in the few-sample regime and reduces the error rate by up to 51% compared to both traditional and advanced data augmentation methods.

NE | Aug 23, 2022
A Nested Genetic Algorithm for Explaining Classification Data Sets with Decision Rules

Paul-Amaury Matt, Rosina Ziegler, Danilo Brajovic et al.

Our goal in this paper is to automatically extract a set of decision rules (rule set) that best explains a classification data set. First, a large set of decision rules is extracted from a set of decision trees trained on the data set. The rule set should be concise, accurate, have a maximum coverage and minimum number of inconsistencies. This problem can be formalized as a modified version of the weighted budgeted maximum coverage problem, known to be NP-hard. To solve the combinatorial optimization problem efficiently, we introduce a nested genetic algorithm which we then use to derive explanations for ten public data sets.
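
For readers unfamiliar with the weighted budgeted maximum coverage problem mentioned above, the following sketch shows a simple greedy baseline for it. This is not the nested genetic algorithm of the paper; the rule names, costs, and weights are made up for illustration, and the greedy ratio rule is a standard heuristic swapped in to show the problem structure.

```python
def greedy_budgeted_coverage(rules, weights, budget):
    """Greedy heuristic for weighted budgeted maximum coverage.

    rules:   dict mapping a rule name to (cost, set of covered sample ids);
             costs are assumed to be strictly positive
    weights: dict mapping a sample id to its weight
    budget:  maximum total cost of the selected rules
    """
    chosen, covered, spent = [], set(), 0.0
    remaining = dict(rules)
    while remaining:
        best, best_ratio = None, 0.0
        for name, (cost, items) in remaining.items():
            if spent + cost > budget:
                continue  # this rule no longer fits into the budget
            gain = sum(weights[i] for i in items - covered)
            ratio = gain / cost
            if ratio > best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            break  # nothing affordable adds new coverage
        cost, items = remaining.pop(best)
        chosen.append(best)
        covered |= items
        spent += cost
    return chosen, covered


# Hypothetical toy instance: three candidate rules over four samples.
example_rules = {
    "r1": (1.0, {0, 1}),
    "r2": (2.0, {1, 2, 3}),
    "r3": (1.0, {3}),
}
example_weights = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}
selected, covered_samples = greedy_budgeted_coverage(
    example_rules, example_weights, budget=3.0
)
```

Each step picks the affordable rule with the best ratio of newly covered weight to cost. A genetic algorithm explores the same search space (subsets of rules within a budget) but can also trade off additional criteria such as consistency.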

LG | Nov 26, 2022
Mixture of Decision Trees for Interpretable Machine Learning

Simeon Brüggenjürgen, Nina Schaaf, Pascal Kerschke et al.

This work introduces a novel interpretable machine learning method called Mixture of Decision Trees (MoDT). It constitutes a special case of the Mixture of Experts ensemble architecture, which utilizes a linear model as gating function and decision trees as experts. Our proposed method is ideally suited for problems that cannot be satisfactorily learned by a single decision tree, but which can alternatively be divided into subproblems. Each subproblem can then be learned well from a single decision tree. Therefore, MoDT can be considered as a method that improves performance while maintaining interpretability by making each of its decisions understandable and traceable to humans. Our work is accompanied by a Python implementation, which uses an interpretable gating function, a fast learning algorithm, and a direct interface to fine-tuned interpretable visualization methods. The experiments confirm that the implementation works and, more importantly, show the superiority of our approach compared to single decision trees and random forests of similar complexity.
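
The combination of a linear gating function with decision-tree experts can be sketched in a few lines. This is an illustrative prediction routine only, not the accompanying Python implementation: the tuple-based tree encoding, the function names, and the hard (argmax) routing are assumptions, and training is omitted entirely.

```python
import math


def tree_predict(node, x):
    """Evaluate a tiny decision tree.

    A leaf is a class label (string); an internal node is a tuple
    (feature_index, threshold, left_subtree, right_subtree).
    """
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node


def gate_probabilities(gate_weights, gate_biases, x):
    """Softmax over one linear score per expert."""
    scores = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(gate_weights, gate_biases)
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shifted for stability
    total = sum(exps)
    return [e / total for e in exps]


def modt_predict(gate_weights, gate_biases, trees, x):
    """Route x to the most probable expert and return (expert, label)."""
    probs = gate_probabilities(gate_weights, gate_biases, x)
    expert = max(range(len(trees)), key=probs.__getitem__)
    return expert, tree_predict(trees[expert], x)


# Hypothetical model: the gate splits on feature 0, each tree on feature 1.
gate_w = [[1.0, 0.0], [-1.0, 0.0]]
gate_b = [0.0, 0.0]
experts = [(1, 0.5, "A", "B"), (1, 0.5, "C", "D")]
result = modt_predict(gate_w, gate_b, experts, [2.0, 0.3])
```

Because every prediction comes from one linear routing decision followed by one path through a single tree, the full decision can be traced by a human, which is the interpretability argument made in the abstract.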

LG | Nov 12, 2025 | Code
Efficiently Transforming Neural Networks into Decision Trees: A Path to Ground Truth Explanations with RENTT

Helena Monke, Benjamin Fresz, Marco Bernreuther et al.

Although neural networks are a powerful tool, their widespread use is hindered by the opacity of their decisions and their black-box nature, which result in a lack of trustworthiness. To alleviate this problem, methods in the field of explainable Artificial Intelligence try to unveil how such automated decisions are made. But explainable AI methods are often plagued by missing faithfulness/correctness, meaning that they sometimes provide explanations that do not align with the neural network's decision and logic. Recently, transformations to decision trees have been proposed to overcome such problems. Unfortunately, they typically lack exactness, scalability, or interpretability as the size of the neural network grows. Thus, we generalize these previous results, especially by considering convolutional neural networks, recurrent neural networks, non-ReLU activation functions, and bias terms. Our findings are accompanied by rigorous proofs and we present a novel algorithm RENTT (Runtime Efficient Network to Tree Transformation) designed to compute an exact equivalent decision tree representation of neural networks in a manner that is both runtime and memory efficient. The resulting decision trees are multivariate and thus, possibly too complex to understand. To alleviate this problem, we also provide a method to calculate the ground truth feature importance for neural networks via the equivalent decision trees - for entire models (global), specific input regions (regional), or single decisions (local). All theoretical results are supported by detailed numerical experiments that emphasize two key aspects: the computational efficiency and scalability of our algorithm, and that only RENTT succeeds in uncovering ground truth explanations compared to conventional approximation methods like LIME and SHAP. All code is available at https://github.com/HelenaM23/RENTT .

LG | Nov 11, 2025 | Code
From Confusion to Clarity: ProtoScore -- A Framework for Evaluating Prototype-Based XAI

Helena Monke, Benjamin Sae-Chew, Benjamin Fresz et al.

The complexity and opacity of neural networks (NNs) pose significant challenges, particularly in high-stakes fields such as healthcare, finance, and law, where understanding decision-making processes is crucial. To address these issues, the field of explainable artificial intelligence (XAI) has developed various methods aimed at clarifying AI decision-making, thereby facilitating appropriate trust and validating the fairness of outcomes. Among these methods, prototype-based explanations offer a promising approach that uses representative examples to elucidate model behavior. However, a critical gap exists regarding standardized benchmarks to objectively compare prototype-based XAI methods, especially in the context of time series data. This lack of reliable benchmarks results in subjective evaluations, hindering progress in the field. We aim to establish a robust framework, ProtoScore, for assessing prototype-based XAI methods across different data types with a focus on time series data, facilitating fair and comprehensive evaluations. By integrating the Co-12 properties of Nauta et al., this framework allows for effectively comparing prototype methods against each other and against other XAI methods, ultimately assisting practitioners in selecting appropriate explanation methods while minimizing the costs associated with user studies. All code is publicly available at https://github.com/HelenaM23/ProtoScore .

AI | Jul 21, 2023
Model Reporting for Certifiable AI: A Proposal from Merging EU Regulation into AI Development

Danilo Brajovic, Niclas Renner, Vincent Philipp Goebels et al.

Despite large progress in Explainable and Safe AI, practitioners suffer from a lack of regulation and standards for AI safety. In this work we merge recent regulation efforts by the European Union and first proposals for AI guidelines with recent trends in research: data and model cards. We propose the use of standardized cards to document AI applications throughout the development process. Our main contribution is the introduction of use-case and operation cards, along with updates for data and model cards to cope with regulatory requirements. We reference both recent research as well as the source of the regulation in our cards and provide references to additional support material and toolboxes whenever possible. The goal is to design cards that help practitioners develop safe AI systems throughout the development process, while enabling efficient third-party auditing of AI applications, being easy to understand, and building trust in the system. Our work incorporates insights from interviews with certification experts as well as developers and individuals working with the developed AI applications.

CV | Jul 6, 2023
Self-supervised Optimization of Hand Pose Estimation using Anatomical Features and Iterative Learning

Christian Jauch, Timo Leitritz, Marco F. Huber

Manual assembly workers face increasing complexity in their work. Human-centered assistance systems could help, but object recognition as an enabling technology hinders sophisticated human-centered design of these systems. At the same time, activity recognition based on hand poses suffers from poor pose estimation in complex usage scenarios, such as wearing gloves. This paper presents a self-supervised pipeline for adapting hand pose estimation to specific use cases with minimal human interaction. This enables cheap and robust hand pose-based activity recognition. The pipeline consists of a general machine learning model for hand pose estimation trained on a generalized dataset, spatial and temporal filtering to account for anatomical constraints of the hand, and a retraining step to improve the model. Different parameter combinations are evaluated on a publicly available and annotated dataset. The best parameter and model combination is then applied to unlabelled videos from a manual assembly scenario. The effectiveness of the pipeline is demonstrated by training an activity recognition as a downstream task in the manual assembly scenario.

CV | Nov 7, 2023
Improving the Effectiveness of Deep Generative Data

Ruyu Wang, Sabrina Schmedding, Marco F. Huber

Recent deep generative models (DGMs) such as generative adversarial networks (GANs) and diffusion probabilistic models (DPMs) have shown their impressive ability in generating high-fidelity photorealistic images. Although these images look appealing to human eyes, training a model on purely synthetic images for downstream image processing tasks like image classification often results in an undesired performance drop compared to training on real data. Previous works have demonstrated that enhancing a real dataset with synthetic images from DGMs can be beneficial. However, the improvements were subject to certain circumstances and were still not comparable to adding the same number of real images. In this work, we propose a new taxonomy to describe factors contributing to this commonly observed phenomenon and investigate it on the popular CIFAR-10 dataset. We hypothesize that the Content Gap accounts for a large portion of the performance drop when using synthetic images from DGMs and propose strategies to better utilize them in downstream tasks. Extensive experiments on multiple datasets showcase that our method outperforms baselines on downstream classification tasks both in case of training on synthetic only (Synthetic-to-Real) and training on a mix of real and synthetic data (Data Augmentation), particularly in the data-scarce scenario.

CY | Jul 22, 2024
The Contribution of XAI for the Safe Development and Certification of AI: An Expert-Based Analysis

Benjamin Fresz, Vincent Philipp Göbels, Safa Omri et al.

Developing and certifying safe - or so-called trustworthy - AI has become an increasingly salient issue, especially in light of upcoming regulation such as the EU AI Act. In this context, the black-box nature of machine learning models limits the use of conventional approaches to certifying complex technical systems. As a potential solution, methods to give insights into this black-box - devised in the field of eXplainable AI (XAI) - could be used. In this study, the potential and shortcomings of such methods for the purpose of safe AI development and certification are discussed in 15 qualitative interviews with experts from the areas of (X)AI and certification. We find that XAI methods can be a helpful asset for safe AI development, as they can show biases and failures of ML-models, but since certification relies on comprehensive and correct information about technical systems, their impact is expected to be limited.

AI | May 11
Constraint-Data-Value-Maximization: Utilizing Data Attribution for Effective Data Pruning in Low-Data Environments

Danilo Brajovic, David A. Kreplin, Marco F. Huber

Attributing model behavior to training data is an evolving research field. A common benchmark is data removal, which involves eliminating data instances with either low or high values, then assessing a model's performance trained on the modified dataset. Many existing studies leverage Shapley-based data values for this task. In this paper, we demonstrate that these data values are not optimally suited for pruning low-value data when only a limited amount of data remains. To address this limitation, we introduce the Constraint-Data-Value-Maximization (CDVM) approach, which effectively utilizes data attributions for pruning in low-data scenarios. By casting pruning as a constrained optimization that both maximizes total influence and penalizes excessive per-test contributions, CDVM delivers robust performance when only a small fraction of the data is retained. On the OpenDataVal benchmark, CDVM shows strong performance and competitive runtime.

LG | May 21, 2023 | Code
Towards Optimal Energy Management Strategy for Hybrid Electric Vehicle with Reinforcement Learning

Xinyang Wu, Elisabeth Wedernikow, Christof Nitsche et al.

In recent years, the development of Artificial Intelligence (AI) has shown tremendous potential in diverse areas. Among them, reinforcement learning (RL) has proven to be an effective solution for learning intelligent control strategies. As an inevitable trend for mitigating climate change, hybrid electric vehicles (HEVs) rely on efficient energy management strategies (EMS) to minimize energy consumption. Many researchers have employed RL to learn optimal EMS for specific vehicle models. However, most of these models tend to be complex and proprietary, making them unsuitable for broad applicability. This paper presents a novel framework, in which we implement and integrate RL-based EMS with the open-source vehicle simulation tool called FASTSim. The learned RL-based EMSs are evaluated on various vehicle models using different test drive cycles and prove to be effective in improving energy efficiency.

LG | Dec 15, 2025
From Overfitting to Reliability: Introducing the Hierarchical Approximate Bayesian Neural Network

Hayk Amirkhanian, Marco F. Huber

In recent years, neural networks have revolutionized various domains, yet challenges such as hyperparameter tuning and overfitting remain significant hurdles. Bayesian neural networks offer a framework to address these challenges by incorporating uncertainty directly into the model, yielding more reliable predictions, particularly for out-of-distribution data. This paper presents the Hierarchical Approximate Bayesian Neural Network (HABNN), a novel approach that uses a Gaussian-inverse-Wishart distribution as a hyperprior of the network's weights to increase both the robustness and performance of the model. We provide analytical representations for the predictive distribution and weight posterior, which amount to the calculation of the parameters of Student's t-distributions in closed form with linear complexity with respect to the number of weights. Our method demonstrates robust performance, effectively addressing issues of overfitting and providing reliable uncertainty estimates, particularly for out-of-distribution tasks. Experimental results indicate that HABNN not only matches but often outperforms state-of-the-art models, suggesting a promising direction for future applications in safety-critical environments.

RO | Feb 26, 2024
RoboGrind: Intuitive and Interactive Surface Treatment with Industrial Robots

Benjamin Alt, Florian Stöckl, Silvan Müller et al.

Surface treatment tasks such as grinding, sanding or polishing are a vital step of the value chain in many industries, but are notoriously challenging to automate. We present RoboGrind, an integrated system for the intuitive, interactive automation of surface treatment tasks with industrial robots. It combines a sophisticated 3D perception pipeline for surface scanning and automatic defect identification, an interactive voice-controlled wizard system for the AI-assisted bootstrapping and parameterization of robot programs, and an automatic planning and execution pipeline for force-controlled robotic surface treatment. RoboGrind is evaluated both under laboratory and real-world conditions in the context of refabricating fiberglass wind turbine blades.

LG | Dec 13, 2023
auto-sktime: Automated Time Series Forecasting

Marc-André Zöller, Marius Lindauer, Marco F. Huber

In today's data-driven landscape, time series forecasting is pivotal in decision-making across various sectors. Yet, the proliferation of more diverse time series data, coupled with the expanding landscape of available forecasting methods, poses significant challenges for forecasters. To meet the growing demand for efficient forecasting, we introduce auto-sktime, a novel framework for automated time series forecasting. The proposed framework uses the power of automated machine learning (AutoML) techniques to automate the creation of the entire forecasting pipeline. The framework employs Bayesian optimization to automatically construct pipelines from statistical, machine learning (ML) and deep neural network (DNN) models. Furthermore, we propose three essential improvements to adapt AutoML to time series data. First, we introduce pipeline templates to account for the different supported forecasting models. Second, we propose a novel warm-starting technique to start the optimization from prior optimization runs. Third, we adapt multi-fidelity optimizations to make them applicable to a search space containing statistical, ML and DNN models. Experimental results on 64 diverse real-world time series datasets demonstrate the effectiveness and efficiency of the framework, outperforming traditional methods while requiring minimal human involvement.

CV | Mar 15, 2025
STAY Diffusion: Styled Layout Diffusion Model for Diverse Layout-to-Image Generation

Ruyu Wang, Xuefeng Hou, Sabrina Schmedding et al.

In layout-to-image (L2I) synthesis, controlled complex scenes are generated from coarse information like bounding boxes. Such a task is exciting to many downstream applications because the input layouts offer strong guidance to the generation process while remaining easily reconfigurable by humans. In this paper, we propose STyled LAYout Diffusion (STAY Diffusion), a diffusion-based model that produces photo-realistic images and provides fine-grained control of stylized objects in scenes. Our approach learns a global condition for each layout, and a self-supervised semantic map for weight modulation using a novel Edge-Aware Normalization (EA Norm). A new Styled-Mask Attention (SM Attention) is also introduced to cross-condition the global condition and image feature for capturing the objects' relationships. These measures provide consistent guidance through the model, enabling more accurate and controllable image generation. Extensive benchmarking demonstrates that our STAY Diffusion presents high-quality images while surpassing previous state-of-the-art methods in generation diversity, accuracy, and controllability.

QUANT-PH | Dec 12, 2024
Data Efficient Prediction of excited-state properties using Quantum Neural Networks

Manuel Hagelüken, Marco F. Huber, Marco Roth

Understanding the properties of excited states of complex molecules is crucial for many chemical and physical processes. Calculating these properties is often significantly more resource-intensive than calculating their ground state counterparts. We present a quantum machine learning model that predicts excited-state properties from the molecular ground state for different geometric configurations. The model comprises a symmetry-invariant quantum neural network and a conventional neural network and is able to provide accurate predictions with only a few training data points. The proposed procedure is fully NISQ compatible. This is achieved by using a quantum circuit that requires a number of parameters linearly proportional to the number of molecular orbitals, along with a parameterized measurement observable, thereby reducing the number of necessary measurements. We benchmark the algorithm on three different molecules with three different system sizes: $H_2$ with four orbitals, LiH with five orbitals, and $H_4$ with six orbitals. For these molecules, we predict the excited state transition energies and transition dipole moments. We show that, in many cases, the procedure is able to outperform various classical models (support vector machines, Gaussian processes, and neural networks) that rely solely on classical features, by up to two orders of magnitude in the test mean squared error.

CV | Aug 8, 2025
ViPro-2: Unsupervised State Estimation via Integrated Dynamics for Guiding Video Prediction

Patrick Takenaka, Johannes Maucher, Marco F. Huber

Predicting future video frames is a challenging task with many downstream applications. Previous work has shown that procedural knowledge enables deep models for complex dynamical settings; however, the model proposed there, ViPro, assumed a given ground truth initial symbolic state. We show that this approach led to the model learning a shortcut that does not actually connect the observed environment with the predicted symbolic state, resulting in the inability to estimate states given an observation if previous states are noisy. In this work, we add several improvements to ViPro that enable the model to correctly infer states from observations without providing a full ground truth state in the beginning. We show that this is possible in an unsupervised manner, and extend the original Orbits dataset with a 3D variant to close the gap to real world scenarios.

LG | Jul 23, 2025
Causal Mechanism Estimation in Multi-Sensor Systems Across Multiple Domains

Jingyi Yu, Tim Pychynski, Marco F. Huber

To gain deeper insights into a complex sensor system through the lens of causality, we present common and individual causal mechanism estimation (CICME), a novel three-step approach to inferring causal mechanisms from heterogeneous data collected across multiple domains. By leveraging the principle of Causal Transfer Learning (CTL), CICME is able to reliably detect domain-invariant causal mechanisms when provided with sufficient samples. The identified common causal mechanisms are further used to guide the estimation of the remaining causal mechanisms in each domain individually. The performance of CICME is evaluated on linear Gaussian models under scenarios inspired by a manufacturing process. Building upon existing continuous optimization-based causal discovery methods, we show that CICME leverages the benefits of applying causal discovery on the pooled data and repeatedly on data from individual domains, and it even outperforms both baseline methods under certain scenarios.

CV | Apr 7, 2025
Generative Adversarial Networks with Limited Data: A Survey and Benchmarking

Omar De Mitri, Ruyu Wang, Marco F. Huber

Generative Adversarial Networks (GANs) have shown impressive results in various image synthesis tasks. Numerous studies have demonstrated that GANs are more powerful in feature and expression learning compared to other generative models and that their latent space encodes rich semantic information. However, the tremendous performance of GANs heavily relies on access to large-scale training data and deteriorates rapidly when the amount of data is limited. This paper aims to provide an overview of GANs, their variants, and their applications in various vision tasks, focusing on addressing the limited data issue. We analyze state-of-the-art GANs in the limited-data regime with designed experiments, and present various methods that attempt to tackle this problem from different perspectives. Finally, we further elaborate on remaining challenges and trends for future research.

LG | Mar 20, 2025
Sample-Efficient Bayesian Transfer Learning for Online Machine Parameter Optimization

Philipp Wagner, Tobias Nagel, Philipp Leube et al.

Correctly setting the parameters of a production machine is essential to improve product quality, increase efficiency, and reduce production costs while also supporting sustainability goals. Identifying optimal parameters involves an iterative process of producing an object and evaluating its quality. Minimizing the number of iterations is, therefore, desirable to reduce the costs associated with unsuccessful attempts. This work introduces a method to optimize the machine parameters in the system itself using a Bayesian optimization algorithm. By leveraging existing machine data, we use a transfer learning approach in order to identify an optimum with minimal iterations, resulting in a cost-effective transfer learning algorithm. We validate our approach on a laser machine for cutting sheet metal in the real world.

CV | Jun 26, 2024
Guiding Video Prediction with Explicit Procedural Knowledge

Patrick Takenaka, Johannes Maucher, Marco F. Huber

We propose a general way to integrate procedural knowledge of a domain into deep learning models. We apply it to the case of video prediction, building on top of object-centric deep models and show that this leads to a better performance than using data-driven models alone. We develop an architecture that facilitates latent space disentanglement in order to use the integrated procedural knowledge, and establish a setup that allows the model to learn the procedural interface in the latent space using the downstream task of video prediction. We contrast the performance to a state-of-the-art data-driven approach and show that problems where purely data-driven approaches struggle can be handled by using knowledge about the domain, providing an alternative to simply collecting more data.

CV | Jun 26, 2024
ViPro: Enabling and Controlling Video Prediction for Complex Dynamical Scenarios using Procedural Knowledge

Patrick Takenaka, Johannes Maucher, Marco F. Huber

We propose a novel architecture design for video prediction in order to utilize procedural domain knowledge directly as part of the computational graph of data-driven models. On the basis of new challenging scenarios we show that state-of-the-art video predictors struggle in complex dynamical settings, and highlight that the introduction of prior process knowledge makes their learning problem feasible. Our approach results in the learning of a symbolically addressable interface between data-driven aspects in the model and our dedicated procedural knowledge module, which we utilize in downstream control tasks.

QUANT-PH | Jun 4, 2024
Reinforcement learning-based architecture search for quantum machine learning

Frederic Rapp, David A. Kreplin, Marco F. Huber et al.

Quantum machine learning models use encoding circuits to map data into a quantum Hilbert space. While it is well known that the architecture of these circuits significantly influences core properties of the resulting model, they are often chosen heuristically. In this work, we present a novel approach using reinforcement learning techniques to generate problem-specific encoding circuits to improve the performance of quantum machine learning models. By specifically using a model-based reinforcement learning algorithm, we reduce the number of necessary circuit evaluations during the search, providing a sample-efficient framework. In contrast to previous search algorithms, our method uses a layered circuit structure that significantly reduces the search space. Additionally, our approach can account for multiple objectives such as solution quality, hardware restrictions and circuit depth. We benchmark our tailored circuits against various reference models, including models with problem-agnostic circuits and classical models. Our results highlight the effectiveness of problem-specific encoding circuits in enhancing QML model performance.

LG | Feb 24, 2022
XAutoML: A Visual Analytics Tool for Understanding and Validating Automated Machine Learning

Marc-André Zöller, Waldemar Titov, Thomas Schlegel et al.

In the last ten years, various automated machine learning (AutoML) systems have been proposed to build end-to-end machine learning (ML) pipelines with minimal human interaction. Even though such automatically synthesized ML pipelines are able to achieve a competitive performance, recent studies have shown that users do not trust models constructed by AutoML due to missing transparency of AutoML systems and missing explanations for the constructed ML pipelines. In a requirements analysis study with 36 domain experts, data scientists, and AutoML researchers from different professions with vastly different expertise in ML, we collect detailed informational needs for AutoML. We propose XAutoML, an interactive visual analytics tool for explaining arbitrary AutoML optimization procedures and ML pipelines constructed by AutoML. XAutoML combines interactive visualizations with established techniques from explainable artificial intelligence (XAI) to make the complete AutoML procedure transparent and explainable. By integrating XAutoML with JupyterLab, experienced users can extend the visual analytics with ad-hoc visualizations based on information extracted from XAutoML. We validate our approach in a user study with the same diverse user group from the requirements analysis. All participants were able to extract useful information from XAutoML, leading to a significantly increased understanding of ML pipelines produced by AutoML and the AutoML optimization itself.

CVFeb 21, 2022
Simplified Learning of CAD Features Leveraging a Deep Residual Autoencoder

Raoul Schönhof, Jannes Elstner, Radu Manea et al.

In the domain of computer vision, deep residual neural networks like EfficientNet have set new standards in terms of robustness and accuracy. One key problem underlying the training of deep neural networks is the immanent lack of a sufficient amount of training data. The problem worsens especially if labels cannot be generated automatically, but have to be annotated manually. This challenge occurs for instance if expert knowledge related to 3D parts should be externalized based on example models. One way to reduce the necessary amount of labeled data may be the use of autoencoders, which can be learned in an unsupervised fashion without labeled data. In this work, we present a deep residual 3D autoencoder based on the EfficientNet architecture, intended for transfer learning tasks related to 3D CAD model assessment. For this purpose, we adapted EfficientNet to 3D problems like voxel models derived from a STEP file. Striving to reduce the amount of labeled 3D data required, the network's encoder can be utilized for transfer training.

SPNov 2, 2021
A MIMO Radar-Based Metric Learning Approach for Activity Recognition

Fady Aziz, Omar Metwally, Pascal Weller et al.

Human activity recognition is of great importance in the medical and surveillance fields. Radar has shown great feasibility for this field based on the captured micro-Doppler (μ-D) signatures. In this paper, a MIMO radar is used to formulate a novel micro-motion spectrogram for the angular velocity (μ-ω) in non-tangential scenarios. Combining both the μ-D and the μ-ω signatures has shown better performance. Classification accuracy of 88.9% was achieved based on a metric learning approach. The experimental setup was designed to capture micro-motion signatures on different aspect angles and line of sight (LOS). The utilized training dataset was of smaller size compared to the state-of-the-art techniques, where eight activities were captured. A few-shot learning approach is used to adapt the pre-trained model for fall detection. The final model has shown a classification accuracy of 86.42% for ten activities.

SPOct 16, 2021
A MIMO Radar-based Few-Shot Learning Approach for Human-ID

Pascal Weller, Fady Aziz, Sherif Abdulatif et al.

Radar for deep learning-based human identification has become a research area of increasing interest. It has been shown that micro-Doppler ($μ$-D) can reflect the walking behavior through capturing the periodic limbs' micro-motions. One of the main aspects is maximizing the number of included classes while considering the real-time and training dataset size constraints. In this paper, a multiple-input-multiple-output (MIMO) radar is used to formulate micro-motion spectrograms of the elevation angular velocity ($μ$-$ω$). The effectiveness of concatenating this newly-formulated spectrogram with the commonly used $μ$-D is investigated. To accommodate for non-constrained real walking motion, an adaptive cycle segmentation framework is utilized and a metric learning network is trained on half gait cycles ($\approx$ 0.5 s). Studies on the effects of various numbers of classes (5--20), different dataset sizes, and varying observation time windows 1--2 s are conducted. A non-constrained walking dataset of 22 subjects is collected with different aspect angles with respect to the radar. The proposed few-shot learning (FSL) approach achieves a classification error of 11.3 % with only 2 min of training data per subject.

ROOct 3, 2021
Precise Object Placement with Pose Distance Estimations for Different Objects and Grippers

Kilian Kleeberger, Jonathan Schnitzler, Muhammad Usman Khalid et al.

This paper introduces a novel approach for the grasping and precise placement of various known rigid objects using multiple grippers within highly cluttered scenes. Using a single depth image of the scene, our method estimates multiple 6D object poses together with an object class, a pose distance for object pose estimation, and a pose distance from a target pose for object placement for each automatically obtained grasp pose with a single forward pass of a neural network. By incorporating model knowledge into the system, our approach has higher success rates for grasping than state-of-the-art model-free approaches. Furthermore, our method chooses grasps that result in significantly more precise object placements than prior model-based work.

LGOct 3, 2021
Kalman Bayesian Neural Networks for Closed-form Online Learning

Philipp Wagner, Xinyang Wu, Marco F. Huber

Compared to point estimates calculated by standard neural networks, Bayesian neural networks (BNN) provide probability distributions over the output predictions and model parameters, i.e., the weights. Training the weight distribution of a BNN, however, is more involved due to the intractability of the underlying Bayesian inference problem and thus, requires efficient approximations. In this paper, we propose a novel approach for BNN learning via closed-form Bayesian inference. For this purpose, the calculation of the predictive distribution of the output and the update of the weight distribution are treated as Bayesian filtering and smoothing problems, where the weights are modeled as Gaussian random variables. This allows closed-form expressions for training the network's parameters in a sequential/online fashion without gradient descent. We demonstrate our method on several UCI datasets and compare it to the state of the art.

LGAug 27, 2021
Reinforcement Learning based Condition-oriented Maintenance Scheduling for Flow Line Systems

Raphael Lamprecht, Ferdinand Wurst, Marco F. Huber

Maintenance scheduling is a complex decision-making problem in the production domain, where a number of maintenance tasks and resources have to be assigned and scheduled to production entities in order to prevent unplanned production downtime. Intelligent maintenance strategies are required that are able to adapt to the dynamics and different conditions of production systems. The paper introduces a deep reinforcement learning approach for condition-oriented maintenance scheduling in flow line systems. Different policies are learned, analyzed and evaluated against a benchmark scheduling heuristic based on reward modelling. The evaluation of the learned policies shows that reinforcement learning based maintenance strategies meet the requirements of the presented use case and are suitable for maintenance scheduling on the shop floor.

CVJul 1, 2021
Towards Measuring Bias in Image Classification

Nina Schaaf, Omar de Mitri, Hang Beom Kim et al.

Convolutional Neural Networks (CNN) have become the de facto state of the art for the main computer vision tasks. However, due to their complex underlying structure, their decisions are hard to understand, which limits their use in some contexts of the industrial world. A common and hard-to-detect challenge in machine learning (ML) tasks is data bias. In this work, we present a systematic approach to uncover data bias by means of attribution maps. For this purpose, first an artificial dataset with a known bias is created and used to train intentionally biased CNNs. The networks' decisions are then inspected using attribution maps. Finally, meaningful metrics are used to measure the attribution maps' representativeness with respect to the known bias. The proposed study shows that some attribution map techniques highlight the presence of bias in the data better than others and metrics can support the identification of bias.

SPMay 30, 2021
DimRad: A Radar-Based Perception System for Prosthetic Leg Barrier Traversing

Fady Aziz, Bassam Elmakhzangy, Christophe Maufroy et al.

Lower extremity amputees face challenges in natural locomotion, which are partially compensated using powered assistive systems, e.g., a microprocessor-controlled prosthetic leg. In this paper, a radar-based perception system is proposed to assist prosthetic legs for autonomous obstacle traversing, focusing on multiple-step staircases. The presented perception system is composed of a radar module operating with a multiple-input-multiple-output (MIMO) configuration to localize consecutive stair corners. An inertial measurement unit (IMU) is integrated for coordinates correction due to the angular dis-positioning that occurs because of the knee angular motion. The captured information from both sensors is used for staircase dimensioning (depth and height). A shallow neural network (NN) is proposed to model the error due to the hardware limitations and enhance the dimension estimation accuracy (1 cm). The algorithm is implemented on a microcontroller subsystem of the radar kit to qualify the perception system for embedded integration in powered prosthetic legs.

ROApr 23, 2021
Automatic Grasp Pose Generation for Parallel Jaw Grippers

Kilian Kleeberger, Florian Roth, Richard Bormann et al.

This paper presents a novel approach for the automatic offline grasp pose synthesis on known rigid objects for parallel jaw grippers. We use several criteria such as gripper stroke, surface friction, and a collision check to determine suitable 6D grasp poses on an object. In contrast to most available approaches, we neither aim for the best grasp pose nor for as many grasp poses as possible, but for a highly diverse set of grasps distributed all along the object. In order to accomplish this objective, we apply a clustering algorithm to the sampled set of grasps. This allows us to simultaneously reduce the set of grasp pose candidates and maintain a high variance in terms of position and orientation between the individual grasps. We demonstrate that the grasps generated by our method can be successfully used in real-world robotic grasping applications.

CVApr 15, 2021
Investigations on Output Parameterizations of Neural Networks for Single Shot 6D Object Pose Estimation

Kilian Kleeberger, Markus Völk, Richard Bormann et al.

Single shot approaches have demonstrated tremendous success on various computer vision tasks. Finding good parameterizations for 6D object pose estimation remains an open challenge. In this work, we propose different novel parameterizations for the output of the neural network for single shot 6D object pose estimation. Our learning-based approach achieves state-of-the-art performance on two public benchmark datasets. Furthermore, we demonstrate that the pose estimates can be used for real-world robotic grasping tasks without additional ICP refinement.

LGJan 26, 2021
Incremental Search Space Construction for Machine Learning Pipeline Synthesis

Marc-André Zöller, Tien-Dung Nguyen, Marco F. Huber

Automated machine learning (AutoML) aims for constructing machine learning (ML) pipelines automatically. Many studies have investigated efficient methods for algorithm selection and hyperparameter optimization. However, methods for ML pipeline synthesis and optimization considering the impact of complex pipeline structures containing multiple preprocessing and classification algorithms have not been studied thoroughly. In this paper, we propose a data-centric approach based on meta-features for pipeline construction and hyperparameter optimization inspired by human behavior. By expanding the pipeline search space incrementally in combination with meta-features of intermediate data sets, we are able to prune the pipeline structure search space efficiently. Consequently, flexible and data set specific ML pipelines can be constructed. We prove the effectiveness and competitiveness of our approach on 28 data sets used in well-established AutoML benchmarks in comparison with state-of-the-art AutoML frameworks.

ROJan 12, 2021
Transferring Experience from Simulation to the Real World for Precise Pick-And-Place Tasks in Highly Cluttered Scenes

Kilian Kleeberger, Markus Völk, Marius Moosmann et al.

In this paper, we introduce a novel learning-based approach for grasping known rigid objects in highly cluttered scenes and precisely placing them based on depth images. Our Placement Quality Network (PQ-Net) estimates the object pose and the quality for each automatically generated grasp pose for multiple objects simultaneously at 92 fps in a single forward pass of a neural network. All grasping and placement trials are executed in a physics simulation and the gained experience is transferred to the real world using domain randomization. We demonstrate that our policy successfully transfers to the real world. PQ-Net outperforms other model-free approaches in terms of grasping success rate and automatically scales to new objects of arbitrary symmetry without any human intervention.

LGNov 16, 2020
A Survey on the Explainability of Supervised Machine Learning

Nadia Burkart, Marco F. Huber

Predictions obtained by, e.g., artificial neural networks have a high accuracy but humans often perceive the models as black boxes. Insights about the decision making are mostly opaque for humans. Understanding the decision making is of paramount importance, particularly in highly sensitive areas such as healthcare or finance. The decision-making behind the black boxes needs to be more transparent, accountable, and understandable for humans. This survey paper provides essential definitions, an overview of the different principles and methodologies of explainable Supervised Machine Learning (SML). We conduct a state-of-the-art survey that reviews past and recent explainable SML approaches and classifies them according to the introduced definitions. Finally, we illustrate principles by means of an explanatory case study and discuss important future directions.

MLSep 3, 2020
Bayesian Perceptron: Towards fully Bayesian Neural Networks

Marco F. Huber

Artificial neural networks (NNs) have become the de facto standard in machine learning. They allow learning highly nonlinear transformations in a plethora of applications. However, NNs usually only provide point estimates without systematically quantifying corresponding uncertainties. In this paper a novel approach towards fully Bayesian NNs is proposed, where training and predictions of a perceptron are performed within the Bayesian inference framework in closed-form. The weights and the predictions of the perceptron are considered Gaussian random variables. Analytical expressions for predicting the perceptron's output and for learning the weights are provided for commonly used activation functions like sigmoid or ReLU. This approach requires no computationally expensive gradient calculations and further allows sequential learning.
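
As a rough illustration of what a closed-form prediction through a ReLU can look like, the sketch below propagates a Gaussian pre-activation through a ReLU using the standard rectified-Gaussian moment formulas. It is an assumption-laden example, not the paper's derivation: the function name `relu_moments` is hypothetical, the pre-activation is assumed to be exactly Gaussian with known mean and variance, and the weight update step is not covered.

```python
import math

def relu_moments(mean, var):
    """Mean and variance of max(0, a) for a ~ N(mean, var), with var > 0.

    Uses the textbook moments of a rectified Gaussian (assumed here,
    not taken from the paper).
    """
    std = math.sqrt(var)
    z = mean / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density at z
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF at z
    out_mean = mean * cdf + std * pdf                        # E[max(0, a)]
    second_moment = (mean * mean + var) * cdf + mean * std * pdf  # E[max(0, a)^2]
    return out_mean, second_moment - out_mean ** 2
```

No gradients are involved: the output moments follow directly from the input moments, which is the property the abstract highlights.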

CVApr 27, 2020
Single Shot 6D Object Pose Estimation

Kilian Kleeberger, Marco F. Huber

In this paper, we introduce a novel single shot approach for 6D object pose estimation of rigid objects based on depth images. For this purpose, a fully convolutional neural network is employed, where the 3D input data is spatially discretized and pose estimation is considered as a regression task that is solved locally on the resulting volume elements. With 65 fps on a GPU, our Object Pose Network (OP-Net) is extremely fast, is optimized end-to-end, and estimates the 6D pose of multiple objects in the image simultaneously. Our approach does not require manually 6D pose-annotated real-world datasets and transfers to the real world, although being entirely trained on synthetic data. The proposed method is evaluated on public benchmark datasets, where we can demonstrate that state-of-the-art methods are significantly outperformed.

CVDec 6, 2019
Large-scale 6D Object Pose Estimation Dataset for Industrial Bin-Picking

Kilian Kleeberger, Christian Landgraf, Marco F. Huber

In this paper, we introduce a new public dataset for 6D object pose estimation and instance segmentation for industrial bin-picking. The dataset comprises both synthetic and real-world scenes. For both, point clouds, depth images, and annotations comprising the 6D pose (position and orientation), a visibility score, and a segmentation mask for each object are provided. Along with the raw data, a method for precisely annotating real-world scenes is proposed. To the best of our knowledge, this is the first public dataset for 6D object pose estimation and instance segmentation for bin-picking containing sufficiently annotated data for learning-based approaches. Furthermore, it is one of the largest public datasets for object pose estimation in general. The dataset is publicly available at http://www.bin-picking.ai/en/dataset.html.

CVSep 8, 2019
Deep Workpiece Region Segmentation for Bin Picking

Muhammad Usman Khalid, Janik M. Hager, Werner Kraus et al.

For most industrial bin picking solutions, the pose of a workpiece is localized by matching a CAD model to a point cloud obtained from a 3D sensor. Distinguishing flat workpieces from the bottom of the bin in the point cloud poses challenges in the localization of workpieces that lead to wrong or phantom detections. In this paper, we propose a framework that solves this problem by automatically segmenting workpiece regions from non-workpiece regions in point cloud data. It is done in real time by applying a fully convolutional neural network trained on both simulated and real data. The real data has been labelled by our novel technique which automatically generates ground truth labels for real point clouds. Along with real-time workpiece segmentation, our framework also helps in improving the number of detected workpieces and estimating the correct object poses. Moreover, it decreases the computation time by approximately 1 s due to a reduction of the search space for the object pose estimation.

LGApr 26, 2019
Benchmark and Survey of Automated Machine Learning Frameworks

Marc-André Zöller, Marco F. Huber

Machine learning (ML) has become a vital part in many aspects of our daily life. However, building well performing machine learning applications requires highly specialized data scientists and domain experts. Automated machine learning (AutoML) aims to reduce the demand for data scientists by enabling domain experts to build machine learning applications automatically without extensive knowledge of statistics and machine learning. This paper is a combination of a survey on current AutoML methods and a benchmark of popular AutoML frameworks on real data sets. Driven by the selected frameworks for evaluation, we summarize and review important AutoML techniques and methods concerning every step in building an ML pipeline. The selected AutoML frameworks are evaluated on 137 data sets from established AutoML benchmark suites.

LGApr 10, 2019
Enhancing Decision Tree based Interpretation of Deep Neural Networks through L1-Orthogonal Regularization

Nina Schaaf, Marco F. Huber, Johannes Maucher

One obstacle that so far prevents the introduction of machine learning models primarily in critical areas is the lack of explainability. In this work, a practicable approach of gaining explainability of deep artificial neural networks (NN) using an interpretable surrogate model based on decision trees is presented. Simply fitting a decision tree to a trained NN usually leads to unsatisfactory results in terms of accuracy and fidelity. Using L1-orthogonal regularization during training, however, preserves the accuracy of the NN, while it can be closely approximated by small decision trees. Tests with different data sets confirm that L1-orthogonal regularization yields models of lower complexity and at the same time higher fidelity compared to other regularizers.
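
To make the idea of an orthogonality-promoting L1 penalty concrete, the sketch below computes the sum of absolute entries of W^T W - I for a weight matrix W. This particular form is an assumption chosen for illustration (the abstract does not spell out the regularizer), and the name `l1_orthogonal_penalty` is hypothetical. In training, such a term would be added to the loss with a weighting factor.

```python
def l1_orthogonal_penalty(weights):
    """Sum of |(W^T W - I)_ij| for a weight matrix given as a list of rows.

    Assumed form of an L1-orthogonality penalty; it is zero exactly
    when the columns of W are orthonormal.
    """
    n_rows = len(weights)
    n_cols = len(weights[0])
    penalty = 0.0
    for i in range(n_cols):
        for j in range(n_cols):
            # Entry (i, j) of the Gram matrix W^T W: inner product of columns i and j.
            gram_ij = sum(weights[k][i] * weights[k][j] for k in range(n_rows))
            target = 1.0 if i == j else 0.0
            penalty += abs(gram_ij - target)
    return penalty
```

For example, the identity matrix incurs no penalty, while `[[1.0, 1.0], [0.0, 1.0]]` has correlated columns and is penalized.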

SYMar 28, 2012
Optimal Pruning for Multi-Step Sensor Scheduling

Marco F. Huber

In the considered linear Gaussian sensor scheduling problem, only one sensor out of a set of sensors performs a measurement. To minimize the estimation error over multiple time steps in a computationally tractable fashion, the so-called information-based pruning algorithm is proposed. It utilizes the information matrices of the sensors and the monotonicity of the Riccati equation. This allows ordering sensors according to their information contribution and excluding many of them from scheduling. Additionally, a tight lower bound is calculated for branch-and-bound search, which further improves the pruning performance.
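
The scheduling problem itself can be sketched for a scalar state, where the Riccati recursion reduces to two one-line formulas. The example below is a brute-force search over all sensor sequences, not the information-based pruning algorithm of the paper. All names (`predict`, `update`, `best_schedule`) and the cost (sum of posterior variances) are assumptions made for illustration. Each sensor is represented by its scalar information h²/r.

```python
from itertools import product

def predict(p, a, q):
    """Time update of the variance for x_next = a * x + w, with w ~ N(0, q)."""
    return a * a * p + q

def update(p, info):
    """Measurement update in information form: 1/p_post = 1/p + info."""
    return p / (1.0 + p * info)

def best_schedule(p0, a, q, sensor_infos, horizon):
    """Exhaustively search all sensor sequences of the given length.

    Returns the sequence (tuple of sensor indices) that minimizes the
    sum of posterior variances, together with that cost.
    """
    best_cost, best_seq = float("inf"), None
    for seq in product(range(len(sensor_infos)), repeat=horizon):
        p, cost = p0, 0.0
        for s in seq:
            p = update(predict(p, a, q), sensor_infos[s])
            cost += p
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq, best_cost
```

In the scalar case the most informative sensor always wins, because both recursion steps are monotone in the variance. With matrix-valued information the ordering is only partial, and the number of sequences grows exponentially with the horizon, which is where pruning becomes important.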

SYMar 20, 2012
Robust Filtering and Smoothing with Gaussian Processes

Marc Peter Deisenroth, Ryan Turner, Marco F. Huber et al.

We propose a principled algorithm for robust Bayesian filtering and smoothing in nonlinear stochastic dynamic systems when both the transition function and the measurement function are described by non-parametric Gaussian process (GP) models. GPs are gaining increasing importance in signal processing, machine learning, robotics, and control for representing unknown system functions by posterior probability distributions. This modern way of "system identification" is more robust than finding point estimates of a parametric function representation. In this article, we present a principled algorithm for robust analytic smoothing in GP dynamic systems, which are increasingly used in robotics and control. Our numerical evaluations demonstrate the robustness of the proposed approach in situations where other state-of-the-art Gaussian filters and smoothers can fail.