Julia Werner

CV
h-index: 6
8 papers
13 citations
Novelty: 46%
AI Score: 49

8 Papers

HC · May 30
MIA: A Visual Analytics System for Multimodal Spectral Imaging Data

Hennes Rave, Katharina Kronenberg, Hannes Gödde et al.

Hyperspectral bioimaging techniques such as infrared (IR) microscopy and laser ablation-inductively coupled plasma-mass spectrometry (LA-ICP-MS) produce high-dimensional, spatially resolved datasets that require sophisticated analysis to reveal chemically and anatomically meaningful structures. Existing software solutions are typically modality-specific and cover only parts of the analytical workflow, forcing researchers to transfer data across multiple tools and manually reconcile results. We present MIA (Multiscale Image Analysis), a modality-agnostic visual analysis environment that integrates the full exploratory workflow -- from spectral preprocessing and dimensionality reduction to interactive segmentation and spectral similarity analysis -- within a single, tightly coupled interface. MIA supports hierarchical and landmark-based embeddings to handle datasets of varying scale and complexity, interactive and automatic segmentation with a shared state across all linked views, and multimodal analysis of co-registered datasets from different instruments. We demonstrate the effectiveness of MIA through three use cases drawn from real analytical chemistry workflows: (1) the recovery of biologically meaningful tissue compartments through derivative preprocessing and hierarchical embedding, (2) pigment identification via spectral similarity search with spatial overview, and (3) multimodal tissue characterization combining molecular IR and elemental LA-ICP-MS data. Qualitative feedback from domain expert collaborators confirms that MIA reduces the need for tool-switching and supports analytical insights that are difficult to obtain with existing software.

LG · Oct 11, 2023
Precise localization within the GI tract by combining classification of CNNs and time-series analysis of HMMs

Julia Werner, Christoph Gerum, Moritz Reiber et al.

This paper presents a method to efficiently classify the gastroenterologic section of images derived from Video Capsule Endoscopy (VCE) studies by combining a Convolutional Neural Network (CNN) for classification with the time-series analysis properties of a Hidden Markov Model (HMM). It is demonstrated that the subsequent time-series analysis identifies and corrects errors in the CNN output. Our approach achieves an accuracy of $98.04\%$ on the Rhode Island (RI) Gastroenterology dataset. This allows for precise localization within the gastrointestinal (GI) tract while requiring only approximately 1M parameters, and thus provides a method suitable for low-power devices.
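The idea of correcting per-frame CNN errors with an HMM can be illustrated with a minimal Viterbi decoding sketch. This is not the authors' implementation: the function name `viterbi_smooth`, the three-section state layout, the transition matrix, the prior, and the use of softmax outputs as stand-ins for emission likelihoods are all assumptions chosen for illustration.

```python
import math

EPS = 1e-12  # avoids log(0) for forbidden transitions


def viterbi_smooth(frame_probs, transition, prior):
    """Most likely state sequence given per-frame class probabilities.

    frame_probs[t][s]: classifier probability of state s at frame t
                       (used here as an approximate emission likelihood).
    transition[i][j]:  assumed P(state j at t+1 | state i at t).
    prior[s]:          assumed P(state s at the first frame).
    """
    n_states = len(prior)
    score = [math.log(prior[s] + EPS) + math.log(frame_probs[0][s] + EPS)
             for s in range(n_states)]
    backpointers = []
    for probs in frame_probs[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            # Best predecessor state for landing in state j at this frame.
            best_i = max(range(n_states),
                         key=lambda i: score[i] + math.log(transition[i][j] + EPS))
            new_score.append(score[best_i]
                             + math.log(transition[best_i][j] + EPS)
                             + math.log(probs[j] + EPS))
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical example: 3 GI sections (0, 1, 2), capsule only moves forward.
transition = [[0.95, 0.05, 0.00],
              [0.00, 0.95, 0.05],
              [0.00, 0.00, 1.00]]
prior = [0.8, 0.1, 0.1]
frame_probs = ([[0.9, 0.05, 0.05]] * 3
               + [[0.1, 0.1, 0.8]]        # isolated misclassification
               + [[0.9, 0.05, 0.05]] * 2
               + [[0.1, 0.8, 0.1]] * 3)
raw = [max(range(3), key=lambda s: p[s]) for p in frame_probs]
smoothed = viterbi_smooth(frame_probs, transition, prior)
```

In this toy sequence the per-frame argmax jumps to section 2 for a single frame, while the decoded path stays in section 0 until the sustained change to section 1, because the assumed transition matrix forbids moving backwards.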

CV · Feb 6
Reliable Mislabel Detection for Video Capsule Endoscopy Data

Julia Werner, Julius Oexle, Oliver Bause et al.

The classification performance of deep neural networks relies strongly on access to large, accurately annotated datasets. In medical imaging, however, obtaining such datasets is particularly challenging since annotations must be provided by specialized physicians, which severely limits the pool of annotators. Furthermore, class boundaries can often be ambiguous or difficult to define, which further complicates machine learning-based classification. In this paper, we address this problem and introduce a framework for mislabel detection in medical datasets. This is validated on the two largest publicly available datasets for Video Capsule Endoscopy, an important imaging procedure for examining the gastrointestinal tract based on a video stream of low-resolution images. In addition, potentially mislabeled samples identified by our pipeline were reviewed and re-annotated by three experienced gastroenterologists. Our results show that the proposed framework successfully detects incorrectly labeled data and, after cleaning the datasets, yields improved anomaly detection performance compared to current baselines.

CV · Apr 28
Image Compression with Bubble-Aware Frame Rate Adaptation for Energy-Efficient Video Capsule Endoscopy

Oliver Bause, Jörg Gammerdinger, Julia Werner

Video Capsule Endoscopy (VCE) is a promising method for improving the medical examination of the small intestine in the gastrointestinal tract. A key challenge is the limited size of the capsules, resulting in a short battery lifetime which conflicts with the high energy consumption of image capturing and transmission to an on-body device. Thus, we propose an image compression pipeline that substantially reduces the transmitted data while preserving diagnostic image quality. Furthermore, we exploit characteristics of the compression process to identify frames with low diagnostic value, mainly caused by bubbles, without requiring additional image analysis. For low-visibility frames, a dynamic bubble-aware frame rate adaptation strategy reduces image acquisition and transmission during these phases while preserving sensitivity to potential anomalies. The proposed compression and frame rate adaptation are evaluated on a RISC-V platform using the Kvasir-Capsule and Galar datasets. The compression method achieves a compression ratio of 5.748 (82.6%) at a peak signal-to-noise ratio of 40.3 dB, indicating negligible loss of visual quality. The compression reduced the mean energy consumption of the whole system by 20.58%. Additionally, the proposed bubble-aware frame rate adaptation reduced the energy consumption by up to 40%. These results demonstrate the potential of our method to increase the applicability of VCE.
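The two quality figures quoted above can be related with a short arithmetic sketch. The abstract does not define how the 82.6% follows from the ratio of 5.748; reading it as the data reduction 1 - 1/ratio is an assumption that happens to match numerically. The PSNR helper uses the standard definition for 8-bit samples (peak value 255), which is likewise an assumption about the image format, and all function names are hypothetical.

```python
import math


def data_reduction(compression_ratio):
    """Fraction of data saved, assuming ratio = original size / compressed size."""
    return 1.0 - 1.0 / compression_ratio


def psnr(original, reconstructed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally long sample lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_for_psnr(psnr_db, max_value=255):
    """Mean squared error that corresponds to a given PSNR (inverse of psnr)."""
    return max_value ** 2 / 10 ** (psnr_db / 10.0)


reduction_percent = round(data_reduction(5.748) * 100, 1)  # 82.6
mse_at_reported_psnr = mse_for_psnr(40.3)                  # roughly 6 for 8-bit data
```

Under these assumptions, a PSNR of 40.3 dB corresponds to a mean squared error of about 6 on a 0 to 255 scale, that is, a typical per-pixel deviation of two to three intensity levels.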

CV · Apr 8, 2025
Enhanced Anomaly Detection for Capsule Endoscopy Using Ensemble Learning Strategies

Julia Werner, Christoph Gerum, Jorg Nick et al.

Capsule endoscopy is a method to capture images of the gastrointestinal tract and screen for diseases which might remain hidden if investigated with standard endoscopes. Due to the limited size of a video capsule, embedding AI models directly into the capsule demands careful consideration of the model size and thus complicates anomaly detection in this field. Furthermore, the scarcity of available data in this domain poses an ongoing challenge to achieving effective anomaly detection. Thus, this work introduces an ensemble strategy to address this challenge in anomaly detection tasks in video capsule endoscopies, requiring only a small number of individual neural networks during both the training and inference phases. Ensemble learning combines the predictions of multiple independently trained neural networks. This has been shown to be highly effective in enhancing both the accuracy and robustness of machine learning models. However, this comes at the cost of higher memory usage and increased computational effort, which quickly becomes prohibitive in many real-world applications. Instead of applying the same training algorithm to each individual network, we propose using various loss functions, drawn from the anomaly detection field, to train each network. The methods are validated on the two largest publicly available datasets for video capsule endoscopy images, the Galar and the Kvasir-Capsule datasets. We achieve an AUC score of 76.86% on the Kvasir-Capsule dataset and an AUC score of 76.98% on the Galar dataset. Our approach outperforms current baselines with significantly fewer parameters across all models, which is a crucial step towards incorporating artificial intelligence into capsule endoscopies.
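As a rough illustration of how predictions from a few independently trained networks might be combined and scored, the sketch below averages min-max normalized anomaly scores and computes the AUC with the rank-based (Mann-Whitney) formula. The abstract does not state the combination rule, so simple averaging is an assumption, and the function names and the toy scores are hypothetical.

```python
def min_max(scores):
    """Rescale scores to [0, 1] so models with different score ranges are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ensemble_scores(per_model_scores):
    """Average the normalized anomaly scores of several models, sample by sample."""
    normalized = [min_max(scores) for scores in per_model_scores]
    return [sum(column) / len(normalized) for column in zip(*normalized)]


def auc(scores, labels):
    """Probability that a random anomalous sample scores above a random normal one."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy data: two hypothetical models trained with different losses, four images.
labels = [0, 0, 1, 1]                      # 1 = anomaly
model_a = [0.10, 0.40, 0.35, 0.80]         # e.g. scores on a [0, 1] scale
model_b = [2.0, 1.0, 6.0, 8.0]             # e.g. distance-like scores
combined = ensemble_scores([model_a, model_b])
```

In this toy case the first model ranks one normal image above an anomalous one, while the averaged score separates the two classes; real data would of course behave less cleanly.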

CV · Jul 31, 2025
Seeing More with Less: Video Capsule Endoscopy with Multi-Task Learning

Julia Werner, Oliver Bause, Julius Oexle et al.

Video capsule endoscopy has become increasingly important for investigating the small intestine within the gastrointestinal tract. However, a persistent challenge remains the short battery lifetime of such compact sensor edge devices. Integrating artificial intelligence can help overcome this limitation by enabling intelligent real-time decision-making, thereby reducing the energy consumption and prolonging the battery life. However, this remains challenging due to data sparsity and the limited resources of the device, which restrict the overall model size. In this work, we introduce a multi-task neural network that combines precise self-localization within the gastrointestinal tract with the ability to detect anomalies in the small intestine within a single model. Throughout the development process, we consistently restricted the total number of parameters to ensure the feasibility of deploying such a model in a small capsule. We report the first multi-task results using the recently published Galar dataset, integrating established multi-task methods and Viterbi decoding for subsequent time-series analysis. This outperforms current single-task models and represents a significant advance in AI-based approaches in this field. Our model achieves an accuracy of 93.63% on the localization task and an accuracy of 87.48% on the anomaly detection task. The approach requires only 1 million parameters while surpassing the current baselines.

IV · Jul 31, 2025
Smart Video Capsule Endoscopy: Raw Image-Based Localization for Enhanced GI Tract Investigation

Oliver Bause, Julia Werner, Paul Palomero Bernardo et al.

For many real-world applications involving low-power sensor edge devices, deep neural networks used for image classification might not be suitable. This is due to their typically large model size and to the number of operations they require, which often exceeds the capabilities of such resource-limited devices. Furthermore, camera sensors usually capture images with a Bayer color filter applied, which are subsequently converted to RGB images that are commonly used for neural network training. However, on resource-constrained devices, such conversions demand their share of energy and should ideally be skipped if possible. This work addresses the need for hardware-suitable AI targeting sensor edge devices by means of Video Capsule Endoscopy, an important medical procedure for the investigation of the small intestine, which is strongly limited by its battery lifetime. Accurate organ classification is performed with a final accuracy of 93.06%, evaluated directly on Bayer images, involving a CNN with only 63,000 parameters and time-series analysis in the form of Viterbi decoding. Finally, the process of capturing images with a camera and raw image processing is demonstrated with a customized PULPissimo System-on-Chip with a RISC-V core and an ultra-low power hardware accelerator, providing an energy-efficient AI-based image classification approach requiring just 5.31 μJ per image. As a result, it is possible to save an average of 89.9% of energy before entering the small intestine compared to classic video capsules.

SP · Jun 19, 2024
Energy-Efficient Seizure Detection Suitable for low-power Applications

Julia Werner, Bhavya Kohli, Paul Palomero Bernardo et al.

Epilepsy is the most common chronic neurological disease worldwide and is typically accompanied by recurring seizures. Neural implants can be used for effective treatment by suppressing an upcoming seizure upon detection. Due to the restricted size and limited battery lifetime of those medical devices, the employed approach also needs to be limited in size and have low energy requirements. We present an energy-efficient seizure detection approach involving a TC-ResNet and time-series analysis which is suitable for low-power edge devices. The presented approach allows for accurate seizure detection without preceding feature extraction while considering the stringent hardware requirements of neural implants. The approach is validated using the CHB-MIT Scalp EEG Database with a 32-bit floating point model and a hardware-suitable 4-bit fixed point model. The presented method achieves an accuracy of 95.28%, a sensitivity of 92.34% and an AUC score of 0.9384 on this dataset with 4-bit fixed point representation. Furthermore, the power consumption of the model is measured with the low-power AI accelerator UltraTrail, which only requires 495 nW on average. Due to this low power consumption, this classification approach is suitable for real-time seizure detection on low-power wearable devices such as neural implants.