Markus Reischl

h-index42

18papers

206citations

Novelty26%

AI Score45

Ranked #41,303 of 194,257 authors (top 21%)#14,552 in CV (top 25%)

18 Papers

19.5CVMar 11, 2023Code

CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting

Simon Graham, Quoc Dang Vu, Mostafa Jahanifar et al.

Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we setup a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of reproducible algorithms for cellular recognition with real-time result inspection on public leaderboards. We conducted an extensive post-challenge analysis based on the top-performing models using 1,658 whole-slide images of colon tissue. With around 700 million detected nuclei per model, associated features were used for dysplasia grading and survival analysis, where we demonstrated that the challenge's improvement over the previous state-of-the-art led to significant boosts in downstream performance. Our findings also suggest that eosinophils and neutrophils play an important role in the tumour microevironment. We release challenge models and WSI-level results to foster the development of further methods for biomarker discovery.

7.7LGSep 27, 2023

MLOps for Scarce Image Data: A Use Case in Microscopic Image Analysis

Angelo Yamachui Sitcheu, Nils Friederich, Simon Baeuerle et al.

Nowadays, Machine Learning (ML) is experiencing tremendous popularity that has never been seen before. The operationalization of ML models is governed by a set of concepts and methods referred to as Machine Learning Operations (MLOps). Nevertheless, researchers, as well as professionals, often focus more on the automation aspect and neglect the continuous deployment and monitoring aspects of MLOps. As a result, there is a lack of continuous learning through the flow of feedback from production to development, causing unexpected model deterioration over time due to concept drifts, particularly when dealing with scarce data. This work explores the complete application of MLOps in the context of scarce data analysis. The paper proposes a new holistic approach to enhance biomedical image analysis. Our method includes: a fingerprinting process that enables selecting the best models, datasets, and model development strategy relative to the image analysis task at hand; an automated model development stage; and a continuous deployment and monitoring process to ensure continuous learning. For preliminary results, we perform a proof of concept for fingerprinting in microscopic image datasets.

3.3LGNov 26, 2022Code

EasyMLServe: Easy Deployment of REST Machine Learning Services

Oliver Neumann, Marcel Schilling, Markus Reischl et al.

Various research domains use machine learning approaches because they can solve complex tasks by learning from data. Deploying machine learning models, however, is not trivial and developers have to implement complete solutions which are often installed locally and include Graphical User Interfaces (GUIs). Distributing software to various users on-site has several problems. Therefore, we propose a concept to deploy software in the cloud. There are several frameworks available based on Representational State Transfer (REST) which can be used to implement cloud-based machine learning services. However, machine learning services for scientific users have special requirements that state-of-the-art REST frameworks do not cover completely. We contribute an EasyMLServe software framework to deploy machine learning services in the cloud using REST interfaces and generic local or web-based GUIs. Furthermore, we apply our framework on two real-world applications, \ie, energy time-series forecasting and cell instance segmentation. The EasyMLServe framework and the use cases are available on GitHub.

5.8LGJun 13, 2022Code

A universal synthetic dataset for machine learning on spectroscopic data

Jan Schuetzke, Nathan J. Szymanski, Markus Reischl

To assist in the development of machine learning methods for automated classification of spectroscopic data, we have generated a universal synthetic dataset that can be used for model validation. This dataset contains artificial spectra designed to represent experimental measurements from techniques including X-ray diffraction, nuclear magnetic resonance, and Raman spectroscopy. The dataset generation process features customizable parameters, such as scan length and peak count, which can be adjusted to fit the problem at hand. As an initial benchmark, we simulated a dataset containing 35,000 spectra based on 500 unique classes. To automate the classification of this data, eight different machine learning architectures were evaluated. From the results, we shed light on which factors are most critical to achieve optimal performance for the classification task. The scripts used to generate synthetic spectra, as well as our benchmark dataset and evaluation routines, are made publicly available to aid in the development of improved machine learning models for spectroscopic analysis.

4.5CLJun 3

SANE Schema-aware Natural-language Evaluation of Biological Data

Rolf Gattung, Martin Krueger, Markus Reischl

High-throughput microscopy generates large, structured datasets capturing cellular responses to pharmacological perturbations, but accessing these datasets typically requires SQL expertise. Large language models offer a natural-language alternative, yet their tendency to hallucinate raises concerns about result reliability . We present SANE Schema-Aware Natural-language Evaluation, a novel paradigm for domain-specific text-to-SQL evaluation: schema-grounded, automatically generated benchmarks tied to real and specific experimental structure. SANE makes evaluation more scalable, systematic, and reproducible. Using SANE, we evaluate a few-shot large language model and show that, under constrained schemas with structured prompting and guardrails, accurate query generation is achievable without any model training or fine-tuning. Most failures stem from ambiguous or underspecified inputs and manifest as overly cautious clarification requests or answers to queries that should first be disambiguated, rather than incorrect SQL generation. These results indicate that few-shot large language models can provide reliable database access in well-defined domains when combined with schema-aware prompting.

5.3CVApr 16

Data Synthesis Improves 3D Myotube Instance Segmentation

David Exler, Nils Friederich, Martin Krüger et al.

Myotubes are multinucleated muscle fibers serving as key model systems for studying muscle physiology, disease mechanisms, and drug responses. Mechanistic studies and drug screening thereby rely on quantitative morphological readouts such as diameter, length, and branching degree, which in turn require precise three-dimensional instance segmentation. Yet established pretrained biomedical segmentation models fail to generalize to this domain due to the absence of large annotated myotube datasets. We introduce a geometry-driven synthesis pipeline that models individual myotubes via polynomial centerlines, locally varying radii, branching structures, and ellipsoidal end caps derived from real microscopy observations. Synthetic volumes are rendered with realistic noise, optical artifacts, and CycleGAN-based Domain Adaptation (DA). A compact 3D U-Net with self-supervised encoder pretraining, trained exclusively on synthetic data, achieves a mean IPQ of 0.22 on real data, significantly outperforming three established zero-shot segmentation models, demonstrating that biophysics-driven synthesis enables effective instance segmentation in annotation-scarce biomedical domains.

13.8CLMay 17, 2024Code

Assessing Political Bias in Large Language Models

Luca Rettenberger, Markus Reischl, Mark Schutera

The assessment of bias within Large Language Models (LLMs) has emerged as a critical concern in the contemporary discourse surrounding Artificial Intelligence (AI) in the context of their potential impact on societal dynamics. Recognizing and considering political bias within LLM applications is especially important when closing in on the tipping point toward performative prediction. Then, being educated about potential effects and the societal behavior LLMs can drive at scale due to their interplay with human operators. In this way, the upcoming elections of the European Parliament will not remain unaffected by LLMs. We evaluate the political bias of the currently most popular open-source LLMs (instruct or assistant models) concerning political issues within the European Union (EU) from a German voter's perspective. To do so, we use the "Wahl-O-Mat," a voting advice application used in Germany. From the voting advice of the "Wahl-O-Mat" we quantize the degree of alignment of LLMs with German political parties. We show that larger models, such as Llama3-70B, tend to align more closely with left-leaning political parties, while smaller models often remain neutral, particularly when prompted in English. The central finding is that LLMs are similarly biased, with low variances in the alignment concerning a specific party. Our findings underline the importance of rigorously assessing and making bias transparent in LLMs to safeguard the integrity and trustworthiness of applications that employ the capabilities of performative prediction and the invisible hand of machine learning prediction and language generation.

8.5IVAug 29, 2024

Improving 3D deep learning segmentation with biophysically motivated cell synthesis

Roman Bruch, Mario Vitacolonna, Elina Nürnberg et al.

Biomedical research increasingly relies on 3D cell culture models and AI-based analysis can potentially facilitate a detailed and accurate feature extraction on a single-cell level. However, this requires for a precise segmentation of 3D cell datasets, which in turn demands high-quality ground truth for training. Manual annotation, the gold standard for ground truth data, is too time-consuming and thus not feasible for the generation of large 3D training datasets. To address this, we present a novel framework for generating 3D training data, which integrates biophysical modeling for realistic cell shape and alignment. Our approach allows the in silico generation of coherent membrane and nuclei signals, that enable the training of segmentation models utilizing both channels for improved performance. Furthermore, we present a new GAN training scheme that generates not only image data but also matching labels. Quantitative evaluation shows superior performance of biophysical motivated synthetic training data, even outperforming manual annotation and pretrained models. This underscores the potential of incorporating biophysical modeling for enhancing synthetic training data quality.

2.8CVFeb 17

Bayesian Optimization for Design Parameters of 3D Image Data Analysis

David Exler, Joaquin Eduardo Urrutia Gómez, Martin Krüger et al.

Deep learning-based segmentation and classification are crucial to large-scale biomedical imaging, particularly for 3D data, where manual analysis is impractical. Although many methods exist, selecting suitable models and tuning parameters remains a major bottleneck in practice. Hence, we introduce the 3D data Analysis Optimization Pipeline, a method designed to facilitate the design and parameterization of segmentation and classification using two Bayesian Optimization stages. First, the pipeline selects a segmentation model and optimizes postprocessing parameters using a domain-adapted syntactic benchmark dataset. To ensure a concise evaluation of segmentation performance, we introduce a segmentation quality metric that serves as the objective function. Second, the pipeline optimizes design choices of a classifier, such as encoder and classifier head architectures, incorporation of prior knowledge, and pretraining strategies. To reduce manual annotation effort, this stage includes an assisted class-annotation workflow that extracts predicted instances from the segmentation results and sequentially presents them to the operator, eliminating the need for manual tracking. In four case studies, the 3D data Analysis Optimization Pipeline efficiently identifies effective model and parameter configurations for individual datasets.

4.8IVFeb 25, 2022

ciscNet -- A Single-Branch Cell Instance Segmentation and Classification Network

Moritz Böhland, Oliver Neumann, Marcel P. Schilling et al.

Automated cell nucleus segmentation and classification are required to assist pathologists in their decision making. The Colon Nuclei Identification and Counting Challenge 2022 (CoNIC Challenge 2022) supports the development and comparability of segmentation and classification methods for histopathological images. In this contribution, we describe our CoNIC Challenge 2022 method ciscNet to segment, classify and count cell nuclei, and report preliminary evaluation results. Our code is available at https://git.scc.kit.edu/ciscnet/ciscnet-conic-2022.

3.2LGApr 11, 2017Code

The MATLAB Toolbox SciXMiner: User's Manual and Programmer's Guide

Ralf Mikut, Andreas Bartschat, Wolfgang Doneit et al.

The Matlab toolbox SciXMiner is designed for the visualization and analysis of time series and features with a special focus to classification problems. It was developed at the Institute of Applied Computer Science of the Karlsruhe Institute of Technology (KIT), a member of the Helmholtz Association of German Research Centres in Germany. The aim was to provide an open platform for the development and improvement of data mining methods and its applications to various medical and technical problems. SciXMiner bases on Matlab (tested for the version 2017a). Many functions do not require additional standard toolboxes but some parts of Signal, Statistics and Wavelet toolboxes are used for special cases. The decision to a Matlab-based solution was made to use the wide mathematical functionality of this package provided by The Mathworks Inc. SciXMiner is controlled by a graphical user interface (GUI) with menu items and control elements like popup lists, checkboxes and edit elements. This makes it easier to work with SciXMiner for inexperienced users. Furthermore, an automatization and batch standardization of analyzes is possible using macros. The standard Matlab style using the command line is also available. SciXMiner is an open source software. The download page is http://sourceforge.net/projects/SciXMiner. It is licensed under the conditions of the GNU General Public License (GNU-GPL) of The Free Software Foundation.

12.0CLMay 7, 2025

Large Means Left: Political Bias in Large Language Models Increases with Their Number of Parameters

David Exler, Mark Schutera, Markus Reischl et al.

With the increasing prevalence of artificial intelligence, careful evaluation of inherent biases needs to be conducted to form the basis for alleviating the effects these predispositions can have on users. Large language models (LLMs) are predominantly used by many as a primary source of information for various topics. LLMs frequently make factual errors, fabricate data (hallucinations), or present biases, exposing users to misinformation and influencing opinions. Educating users on their risks is key to responsible use, as bias, unlike hallucinations, cannot be caught through data verification. We quantify the political bias of popular LLMs in the context of the recent vote of the German Bundestag using the score produced by the Wahl-O-Mat. This metric measures the alignment between an individual's political views and the positions of German political parties. We compare the models' alignment scores to identify factors influencing their political preferences. Doing so, we discover a bias toward left-leaning parties, most dominant in larger LLMs. Also, we find that the language we use to communicate with the models affects their political views. Additionally, we analyze the influence of a model's origin and release date and compare the results to the outcome of the recent vote of the Bundestag. Our results imply that LLMs are prone to exhibiting political bias. Large corporations with the necessary means to develop LLMs, thus, knowingly or unknowingly, have a responsibility to contain these biases, as they can influence each voter's decision-making process and inform public opinion in general and at scale.

6.3IVNov 30, 2024Code

Energy-Based Prior Latent Space Diffusion model for Reconstruction of Lumbar Vertebrae from Thick Slice MRI

Yanke Wang, Yolanne Y. R. Lee, Aurelio Dolfini et al.

Lumbar spine problems are ubiquitous, motivating research into targeted imaging for treatment planning and guided interventions. While high resolution and high contrast CT has been the modality of choice, MRI can capture both bone and soft tissue without the ionizing radiation of CT albeit longer acquisition time. The critical trade-off between contrast quality and acquisition time has motivated 'thick slice MRI', which prioritises faster imaging with high in-plane resolution but variable contrast and low through-plane resolution. We investigate a recently developed post-acquisition pipeline which segments vertebrae from thick-slice acquisitions and uses a variational autoencoder to enhance quality after an initial 3D reconstruction. We instead propose a latent space diffusion energy-based prior to leverage diffusion models, which exhibit high-quality image generation. Crucially, we mitigate their high computational cost and low sample efficiency by learning an energy-based latent representation to perform the diffusion processes. Our resulting method outperforms existing approaches across metrics including Dice and VS scores, and more faithfully captures 3D features.

3.7CVNov 27, 2021

Label Assistant: A Workflow for Assisted Data Annotation in Image Segmentation Tasks

Marcel P. Schilling, Luca Rettenberger, Friedrich Münke et al.

Recent research in the field of computer vision strongly focuses on deep learning architectures to tackle image processing problems. Deep neural networks are often considered in complex image processing scenarios since traditional computer vision approaches are expensive to develop or reach their limits due to complex relations. However, a common criticism is the need for large annotated datasets to determine robust parameters. Annotating images by human experts is time-consuming, burdensome, and expensive. Thus, support is needed to simplify annotation, increase user efficiency, and annotation quality. In this paper, we propose a generic workflow to assist the annotation process and discuss methods on an abstract level. Thereby, we review the possibilities of focusing on promising samples, image pre-processing, pre-labeling, label inspection, or post-processing of annotations. In addition, we present an implementation of the proposal by means of a developed flexible and extendable software prototype nested in hybrid touchscreen/laptop device.

0.9CVOct 11, 2019

Towards DeepSpray: Using Convolutional Neural Network to post-process Shadowgraphy Images of Liquid Atomization

Geoffroy Chaussonnet, Christian Lieber, Yan Yikang et al.

This technical report investigates the potential of Convolutional Neural Networks to post-process images from primary atomization. Three tasks are investigated. First, the detection and segmentation of liquid droplets in degraded optical conditions. Second, the detection of overlapping ellipses and the prediction of their geometrical characteristics. This task corresponds to extrapolate the hidden contour of an ellipse with reduced visual information. Third, several features of the liquid surface during primary breakup (ligaments, bags, rims) are manually annotated on 15 experimental images. The detector is trained on this minimal database using simple data augmentation and then applied to other images from numerical simulation and from other experiment. In these three tasks, models from the literature based on Convolutional Neural Networks showed very promising results, thus demonstrating the high potential of Deep Learning to post-process liquid atomization. The next step is to embed these models into a unified framework DeepSpray.

3.1AINov 27, 2018

Distributed traffic light control at uncoupled intersections with real-world topology by deep reinforcement learning

Mark Schutera, Niklas Goby, Stefan Smolarek et al.

This work examines the implications of uncoupled intersections with local real-world topology and sensor setup on traffic light control approaches. Control approaches are evaluated with respect to: Traffic flow, fuel consumption and noise emission at intersections. The real-world road network of Friedrichshafen is depicted, preprocessed and the present traffic light controlled intersections are modeled with respect to state space and action space. Different strategies, containing fixed-time, gap-based and time-based control approaches as well as our deep reinforcement learning based control approach, are implemented and assessed. Our novel DRL approach allows for modeling the TLC action space, with respect to phase selection as well as selection of transition timings. It was found that real-world topologies, and thus irregularly arranged intersections have an influence on the performance of traffic light control approaches. This is even to be observed within the same intersection types (n-arm, m-phases). Moreover we could show, that these influences can be efficiently dealt with by our deep reinforcement learning based control approach.

2.2LGOct 19, 2018

Transfer Learning versus Multi-agent Learning regarding Distributed Decision-Making in Highway Traffic

Mark Schutera, Niklas Goby, Dirk Neumann et al.

Transportation and traffic are currently undergoing a rapid increase in terms of both scale and complexity. At the same time, an increasing share of traffic participants are being transformed into agents driven or supported by artificial intelligence resulting in mixed-intelligence traffic. This work explores the implications of distributed decision-making in mixed-intelligence traffic. The investigations are carried out on the basis of an online-simulated highway scenario, namely the MIT \emph{DeepTraffic} simulation. In the first step traffic agents are trained by means of a deep reinforcement learning approach, being deployed inside an elitist evolutionary algorithm for hyperparameter search. The resulting architectures and training parameters are then utilized in order to either train a single autonomous traffic agent and transfer the learned weights onto a multi-agent scenario or else to conduct multi-agent learning directly. Both learning strategies are evaluated on different ratios of mixed-intelligence traffic. The strategies are assessed according to the average speed of all agents driven by artificial intelligence. Traffic patterns that provoke a reduction in traffic flow are analyzed with respect to the different strategies.

1.0MLJan 16, 2017

Datenqualität in Regressionsproblemen

Wolfgang Doneit, Ralf Mikut, Markus Reischl

Regression models are increasingly built using datasets which do not follow a design of experiment. Instead, the data is e.g. gathered by an automated monitoring of a technical system. As a consequence, already the input data represents phenomena of the system and violates statistical assumptions of distributions. The input data can show correlations, clusters or other patterns. Further, the distribution of input data influences the reliability of regression models. We propose criteria to quantify typical phenomena of input data for regression and show their suitability with simulated benchmark datasets. ----- Regressionen werden zunehmend auf Datensätzen angewendet, deren Eingangsvektoren nicht durch eine statistische Versuchsplanung festgelegt wurden. Stattdessen werden die Daten beispielsweise durch die passive Beobachtung technischer Systeme gesammelt. Damit bilden bereits die Eingangsdaten Phänomene des Systems ab und widersprechen statistischen Verteilungsannahmen. Die Verteilung der Eingangsdaten hat Einfluss auf die Zuverlässigkeit eines Regressionsmodells. Wir stellen deshalb Bewertungskriterien für einige typische Phänomene in Eingangsdaten von Regressionen vor und zeigen ihre Funktionalität anhand simulierter Benchmarkdatensätze.