Samuel Kounev

LG
h-index18
19papers
359citations
Novelty31%
AI Score40

19 Papers

LGSep 26, 2023
Telescope: An Automated Hybrid Forecasting Approach on a Level-Playing Field

André Bauer, Mark Leznik, Michael Stenger et al.

In many areas of decision-making, forecasting is an essential pillar. Consequently, many different forecasting methods have been proposed. From our experience, recently presented forecasting methods are computationally intensive, poorly automated, tailored to a particular data set, or they lack a predictable time-to-result. To this end, we introduce Telescope, a novel machine learning-based forecasting approach that automatically retrieves relevant information from a given time series and splits it into parts, handling each of them separately. In contrast to deep learning methods, our approach doesn't require parameterization or the need to train and fit a multitude of parameters. It operates with just one time series and provides forecasts within seconds without any additional setup. Our experiments show that Telescope outperforms recent methods by providing accurate and reliable forecasts while making no assumptions about the analyzed time series.

CVJul 10, 2024
Early Explorations of Lightweight Models for Wound Segmentation on Mobile Devices

Vanessa Borst, Timo Dittus, Konstantin Müller et al.

The aging population poses numerous challenges to healthcare, including the increase in chronic wounds in the elderly. The current approach to wound assessment by therapists based on photographic documentation is subjective, highlighting the need for computer-aided wound recognition from smartphone photos. This offers objective and convenient therapy monitoring, while being accessible to patients from their home at any time. However, despite research in mobile image segmentation, there is a lack of focus on mobile wound segmentation. To address this gap, we conduct initial research on three lightweight architectures to investigate their suitability for smartphone-based wound segmentation. Using public datasets and UNet as a baseline, our results are promising, with both ENet and TopFormer, as well as the larger UNeXt variant, showing comparable performance to UNet. Furthermore, we deploy the models into a smartphone app for visual assessment of live segmentation, where results demonstrate the effectiveness of TopFormer in distinguishing wounds from wound-coloured objects. While our study highlights the potential of transformer models for mobile wound segmentation, future work should aim to further improve the mask contours.

LGApr 26, 2025Code
TSRM: A Lightweight Temporal Feature Encoding Architecture for Time Series Forecasting and Imputation

Robert Leppich, Michael Stenger, Daniel Grillmeyer et al.

We introduce a temporal feature encoding architecture called Time Series Representation Model (TSRM) for multivariate time series forecasting and imputation. The architecture is structured around CNN-based representation layers, each dedicated to an independent representation learning task and designed to capture diverse temporal patterns, followed by an attention-based feature extraction layer and a merge layer, designed to aggregate extracted features. The architecture is fundamentally based on a configuration that is inspired by a Transformer encoder, with self-attention mechanisms at its core. The TSRM architecture outperforms state-of-the-art approaches on most of the seven established benchmark datasets considered in our empirical evaluation for both forecasting and imputation tasks. At the same time, it significantly reduces complexity in the form of learnable parameters. The source code is available at https://github.com/RobertLeppich/TSRM.

AIJul 8, 2025Code
Decomposing the Time Series Forecasting Pipeline: A Modular Approach for Time Series Representation, Information Extraction, and Projection

Robert Leppich, Michael Stenger, André Bauer et al.

With the advent of Transformers, time series forecasting has seen significant advances, yet it remains challenging due to the need for effective sequence representation, memory construction, and accurate target projection. Time series forecasting remains a challenging task, demanding effective sequence representation, meaningful information extraction, and precise future projection. Each dataset and forecasting configuration constitutes a distinct task, each posing unique challenges the model must overcome to produce accurate predictions. To systematically address these task-specific difficulties, this work decomposes the time series forecasting pipeline into three core stages: input sequence representation, information extraction and memory construction, and final target projection. Within each stage, we investigate a range of architectural configurations to assess the effectiveness of various modules, such as convolutional layers for feature extraction and self-attention mechanisms for information extraction, across diverse forecasting tasks, including evaluations on seven benchmark datasets. Our models achieve state-of-the-art forecasting accuracy while greatly enhancing computational efficiency, with reduced training and inference times and a lower parameter count. The source code is available at https://github.com/RobertLeppich/REP-Net.

CVMar 13
Are General-Purpose Vision Models All We Need for 2D Medical Image Segmentation? A Cross-Dataset Empirical Study

Vanessa Borst, Samuel Kounev

Medical image segmentation (MIS) is a fundamental component of computer-assisted diagnosis and clinical decision support systems. Over the past decade, numerous architectures specifically tailored to medical imaging have emerged to address domain-specific challenges such as low contrast, small anatomical structures, and limited annotated data. In parallel, rapid progress in computer vision has produced highly capable general-purpose vision models (GP-VMs) originally designed for natural images. Despite their strong performance on standard vision benchmarks, their effectiveness for MIS remains insufficiently understood. In this work, we conduct a controlled empirical study to examine whether specialized medical segmentation architectures (SMAs) provide systematic advantages over modern GP-VMs for 2D MIS. We compare eleven SMAs and GP-VMs using a unified training and evaluation protocol. Experiments are performed across three heterogeneous datasets covering different imaging modalities, class structures, and data characteristics. Beyond segmentation accuracy, we analyze qualitative Grad-CAM visualizations to investigate explainability (XAI) behavior. Our results demonstrate that, for the analyzed datasets, GP-VMs out-perform the majority of specialized MIS models. Moreover, XAI analyses indicate that GP-VMs can capture clinically relevant structures without explicit domain-specific architectural design. These findings suggest that GP-VMs can represent a viable alternative to domain-specific methods, highlighting the importance of informed model selection for end-to-end MIS systems. All code and resources are available at GitHub.

LGJan 4, 2024
Comprehensive Exploration of Synthetic Data Generation: A Survey

André Bauer, Simon Trapp, Michael Stenger et al.

Recent years have witnessed a surge in the popularity of Machine Learning (ML), applied across diverse domains. However, progress is impeded by the scarcity of training data due to expensive acquisition and privacy legislation. Synthetic data emerges as a solution, but the abundance of released models and limited overview literature pose challenges for decision-making. This work surveys 417 Synthetic Data Generation (SDG) models over the last decade, providing a comprehensive overview of model types, functionality, and improvements. Common attributes are identified, leading to a classification and trend analysis. The findings reveal increased model performance and complexity, with neural network-based approaches prevailing, except for privacy-preserving data generation. Computer vision dominates, with GANs as primary generative models, while diffusion models, transformers, and RNNs compete. Implications from our performance evaluation highlight the scarcity of common metrics and datasets, making comparisons challenging. Additionally, the neglect of training and computational costs in literature necessitates attention in future research. This work serves as a guide for SDG model selection and identifies crucial areas for future exploration.

CVApr 8, 2025
WoundAmbit: Bridging State-of-the-Art Semantic Segmentation and Real-World Wound Care

Vanessa Borst, Timo Dittus, Tassilo Dege et al.

Chronic wounds affect a large population, particularly the elderly and diabetic patients, who often exhibit limited mobility and co-existing health conditions. Automated wound monitoring via mobile image capture can reduce in-person physician visits by enabling remote tracking of wound size. Semantic segmentation is key to this process, yet wound segmentation remains underrepresented in medical imaging research. To address this, we benchmark state-of-the-art deep learning models from general-purpose vision, medical imaging, and top methods from public wound challenges. For a fair comparison, we standardize training, data augmentation, and evaluation, conducting cross-validation to minimize partitioning bias. We also assess real-world deployment aspects, including generalization to an out-of-distribution wound dataset, computational efficiency, and interpretability. Additionally, we propose a reference object-based approach to convert AI-generated masks into clinically relevant wound size estimates and evaluate this, along with mask quality, for the five best architectures based on physician assessments. Overall, the transformer-based TransNeXt showed the highest levels of generalizability. Despite variations in inference times, all models processed at least one image per second on the CPU, which is deemed adequate for the intended application. Interpretability analysis typically revealed prominent activations in wound regions, emphasizing focus on clinically relevant features. Expert evaluation showed high mask approval for all analyzed models, with VWFormer and ConvNeXtS backbone performing the best. Size retrieval accuracy was similar across models, and predictions closely matched expert annotations. Finally, we demonstrate how our AI-driven wound size estimation framework, WoundAmbit, is integrated into a custom telehealth system.

LGMay 27, 2025
STEB: In Search of the Best Evaluation Approach for Synthetic Time Series

Michael Stenger, Robert Leppich, André Bauer et al.

The growing need for synthetic time series, due to data augmentation or privacy regulations, has led to numerous generative models, frameworks, and evaluation measures alike. Objectively comparing these measures on a large scale remains an open challenge. We propose the Synthetic Time series Evaluation Benchmark (STEB) -- the first benchmark framework that enables comprehensive and interpretable automated comparisons of synthetic time series evaluation measures. Using 10 diverse datasets, randomness injection, and 13 configurable data transformations, STEB computes indicators for measure reliability and score consistency. It tracks running time, test errors, and features sequential and parallel modes of operation. In our experiments, we determine a ranking of 41 measures from literature and confirm that the choice of upstream time series embedding heavily impacts the final score.

NENov 23, 2021
A Case Study on Optimization of Warehouses

Veronika Lesch, Patrick B. M. Müller, Moritz Krämer et al.

In warehouses, order picking is known to be the most labor-intensive and costly task in which the employees account for a large part of the warehouse performance. Hence, many approaches exist, that optimize the order picking process based on diverse economic criteria. However, most of these approaches focus on a single economic objective at once and disregard ergonomic criteria in their optimization. Further, the influence of the placement of the items to be picked is underestimated and accordingly, too little attention is paid to the interdependence of these two problems. In this work, we aim at optimizing the storage assignment and the order picking problem within mezzanine warehouse with regards to their reciprocal influence. We propose a customized version of the Non-dominated Sorting Genetic Algorithm II (NSGA-II) for optimizing the storage assignment problem as well as an Ant Colony Optimization (ACO) algorithm for optimizing the order picking problem. Both algorithms incorporate multiple economic and ergonomic constraints simultaneously. Furthermore, the algorithms incorporate knowledge about the interdependence between both problems, aiming to improve the overall warehouse performance. Our evaluation results show that our proposed algorithms return better storage assignments and order pick routes compared to commonly used techniques for the following quality indicators for comparing Pareto fronts: Coverage, Generational Distance, Euclidian Distance, Pareto Front Size, and Inverted Generational Distance. Additionally, the evaluation regarding the interaction of both algorithms shows a better performance when combining both proposed algorithms.

SENov 18, 2021
A Case Study on Optimization of Platooning Coordination

Veronika Lesch, Marius Hadry, Samuel Kounev et al.

In today's world, circumstances, processes, and requirements for software systems are becoming increasingly complex. In order to operate properly in such dynamic environments, software systems must adapt to these changes, which has led to the research area of Self-Adaptive Systems (SAS). Platooning is one example of adaptive systems in Intelligent Transportation Systems, which is the ability of vehicles to travel with close inter-vehicle distances. This technology leads to an increase in road throughput and safety, which directly addresses the increased infrastructure needs due to increased traffic on the roads. However, the No-Free-Lunch theorem states that the performance of one platooning coordination strategy is not necessarily transferable to other problems. Moreover, especially in the field of SAS, the selection of the most appropriate strategy depends on the current situation of the system. In this paper, we address the problem of self-aware optimization of adaptation planning strategies by designing a framework that includes situation detection, strategy selection, and parameter optimization of the selected strategies. We apply our approach on the case study platooning coordination and evaluate the performance of the proposed framework.

NENov 17, 2021
A Case Study of Vehicle Route Optimization

Veronika Lesch, Maximilian König, Samuel Kounev et al.

In the last decades, the classical Vehicle Routing Problem (VRP), i.e., assigning a set of orders to vehicles and planning their routes has been intensively researched. As only the assignment of order to vehicles and their routes is already an NP-complete problem, the application of these algorithms in practice often fails to take into account the constraints and restrictions that apply in real-world applications, the so called rich VRP (rVRP) and are limited to single aspects. In this work, we incorporate the main relevant real-world constraints and requirements. We propose a two-stage strategy and a Timeline algorithm for time windows and pause times, and apply a Genetic Algorithm (GA) and Ant Colony Optimization (ACO) individually to the problem to find optimal solutions. Our evaluation of eight different problem instances against four state-of-the-art algorithms shows that our approach handles all given constraints in a reasonable time.

DCJul 28, 2021
A Case Study on the Stability of Performance Tests for Serverless Applications

Simon Eismann, Diego Elias Costa, Lizhi Liao et al.

Context. While in serverless computing, application resource management and operational concerns are generally delegated to the cloud provider, ensuring that serverless applications meet their performance requirements is still a responsibility of the developers. Performance testing is a commonly used performance assessment practice; however, it traditionally requires visibility of the resource environment. Objective. In this study, we investigate whether performance tests of serverless applications are stable, that is, if their results are reproducible, and what implications the serverless paradigm has for performance tests. Method. We conduct a case study where we collect two datasets of performance test results: (a) repetitions of performance tests for varying memory size and load intensities and (b) three repetitions of the same performance test every day for ten months. Results. We find that performance tests of serverless applications are comparatively stable if conducted on the same day. However, we also observe short-term performance variations and frequent long-term performance changes. Conclusion. Performance tests for serverless applications can be stable; however, the serverless model impacts the planning, execution, and analysis of performance tests.

DCOct 28, 2020
Sizeless: Predicting the optimal size of serverless functions

Simon Eismann, Long Bui, Johannes Grohmann et al.

Serverless functions are a cloud computing paradigm where the provider takes care of resource management tasks such as resource provisioning, deployment, and auto-scaling. The only resource management task that developers are still in charge of is selecting how much resources are allocated to each worker instance. However, selecting the optimal size of serverless functions is quite challenging, so developers often neglect it despite its significant cost and performance benefits. Existing approaches aiming to automate serverless functions resource sizing require dedicated performance tests, which are time-consuming to implement and maintain. In this paper, we introduce an approach to predict the optimal resource size of a serverless function using monitoring data from a single resource size. As our approach does not require dedicated performance tests, it enables cloud providers to implement resource sizing on a platform level and automate the last resource management task associated with serverless functions. We evaluate our approach on three different serverless applications, where it selects the optimal memory size for 71.7% of the serverless functions and the second-best memory size for 22.3% of the serverless functions, which results in an average speedup of 43.6% while simultaneously decreasing average costs by 10.2%.

NIMay 17, 2020
Attack-aware Security Function Chain Reordering

Lukas Iffländer, Nishant Rawtani, Lukas Beierlieb et al.

Attack-awareness recognizes self-awareness for security systems regarding the occurring attacks. More frequent and intense attacks on cloud and network infrastructures are pushing security systems to the limit. With the end of Moore's Law, merely scaling against these attacks is no longer economically justified. Previous works have already dealt with the adoption of Software-defined Networking and Network Function Virtualization in security systems and used both approaches to optimize performance by the intelligent placement of security functions. However, these works have not yet considered the sequence in which traffic passes through these functions. In this work, we make a case for the need to take this ordering into account by showing its impact. We then propose a reordering framework and analyze what aspects are necessary for modeling security service function chains and making decisions regarding the order based on those models. We show the impact of the order and validate our framework in an evaluation environment. The effect can extend to multiple orders of magnitude, and the framework's evaluation proves the feasibility of our concept.

LGFeb 5, 2020
A Survey on Predictive Maintenance for Industry 4.0

Christian Krupitzer, Tim Wagenhals, Marwin Züfle et al.

Production issues at Volkswagen in 2016 lead to dramatic losses in sales of up to 400 million Euros per week. This example shows the huge financial impact of a working production facility for companies. Especially in the data-driven domains of Industry 4.0 and Industrial IoT with intelligent, connected machines, a conventional, static maintenance schedule seems to be old-fashioned. In this paper, we present a survey on the current state of the art in predictive maintenance for Industry 4.0. Based on a structured literate survey, we present a classification of predictive maintenance in the context of Industry 4.0 and discuss recent developments in this area.

HCFeb 3, 2020
A Survey on Human Machine Interaction in Industry 4.0

Christian Krupitzer, Sebastian Müller, Veronika Lesch et al.

Industry 4.0 or Industrial IoT both describe new paradigms for seamless interaction between humans and machines. Both concepts rely on intelligent, inter-connected cyber-physical production systems that are able to control the process flow of industrial production. As those machines take many decisions autonomously and further interact with production and manufacturing planning systems, the integration of human users requires new paradigms. In this paper, we provide an analysis of the current state-of-the-art in human-machine interaction in the Industry 4.0 domain.We focus on new paradigms that integrate the application of augmented and virtual reality technology. Based on our analysis, we further provide a discussion of research challenges.

CRJul 15, 2019
Hands Off my Database: Ransomware Detection in Databases through Dynamic Analysis of Query Sequences

Lukas Iffländer, Alexandra Dmitrienko, Christoph Hagen et al.

Ransomware is an emerging threat which imposed a \$ 5 billion loss in 2017 and is predicted to hit \$ 11.5 billion in 2019. While initially targeting PC (client) platforms, ransomware recently made the leap to server-side databases - starting in January 2017 with the MongoDB Apocalypse attack, followed by other attack waves targeting a wide range of DB types such as MongoDB, MySQL, ElasticSearch, Cassandra, Hadoop, and CouchDB. While previous research has developed countermeasures against client-side ransomware (e.g., CryptoDrop and ShieldFS), the problem of server-side ransomware has received zero attention so far. In our work, we aim to bridge this gap and present DIMAQS (Dynamic Identification of Malicious Query Sequences), a novel anti-ransomware solution for databases. DIMAQS performs runtime monitoring of incoming queries and pattern matching using Colored Petri Nets (CPNs) for attack detection. Our system design exhibits several novel techniques to enable efficient detection of malicious query sequences globally (i.e., without limiting detection to distinct user connections). Our proof-of-concept implementation targets MySQL servers. The evaluation shows high efficiency with no false positives and no false negatives and very moderate performance overhead of under 5%. We will publish our data sets and implementation allowing the community to reproduce our tests and compare to our results.

CROct 5, 2014
On Benchmarking Intrusion Detection Systems in Virtualized Environments

Aleksandar Milenkoski, Samuel Kounev, Alberto Avritzer et al.

Modern intrusion detection systems (IDSes) for virtualized environments are deployed in the virtualization layer with components inside the virtual machine monitor (VMM) and the trusted host virtual machine (VM). Such IDSes can monitor at the same time the network and host activities of all guest VMs running on top of a VMM being isolated from malicious users of these VMs. We refer to IDSes for virtualized environments as VMM-based IDSes. In this work, we analyze state-of-the-art intrusion detection techniques applied in virtualized environments and architectures of VMM-based IDSes. Further, we identify challenges that apply specifically to benchmarking VMM-based IDSes focussing on workloads and metrics. For example, we discuss the challenge of defining representative baseline benign workload profiles as well as the challenge of defining malicious workloads containing attacks targeted at the VMM. We also discuss the impact of on-demand resource provisioning features of virtualized environments (e.g., CPU and memory hotplugging, memory ballooning) on IDS benchmarking measures such as capacity and attack detection accuracy. Finally, we outline future research directions in the area of benchmarking VMM-based IDSes and of intrusion detection in virtualized environments in general.

CROct 5, 2014
Technical Information on Vulnerabilities of Hypercall Handlers

Aleksandar Milenkoski, Marco Vieira, Bryan D. Payne et al.

Modern virtualized service infrastructures expose attack vectors that enable attacks of high severity, such as attacks targeting hypervisors. A malicious user of a guest VM (virtual machine) may execute an attack against the underlying hypervisor via hypercalls, which are software traps from a kernel of a fully or partially paravirtualized guest VM to the hypervisor. The exploitation of a vulnerability of a hypercall handler may have severe consequences such as altering hypervisor's memory, which may result in the execution of malicious code with hypervisor privilege. Despite the importance of vulnerabilities of hypercall handlers, there is not much publicly available information on them. This significantly hinders advances towards securing hypercall interfaces. In this work, we provide in-depth technical information on publicly disclosed vulnerabilities of hypercall handlers. Our vulnerability analysis is based on reverse engineering the released patches fixing the considered vulnerabilities. For each analyzed vulnerability, we provide background information essential for understanding the vulnerability, and information on the vulnerable hypercall handler and the error causing the vulnerability. We also show how the vulnerability can be triggered and discuss the state of the targeted hypervisor after the vulnerability has been triggered.