Julian Kunkel

DC
h-index: 21
7 papers
16 citations
Novelty: 31%
AI Score: 44

7 Papers

31.1 DC May 25
An Empirical Evaluation of Quantum-Inspired QUBO Methods for Heterogeneous HPC Workflow Mapping and Scheduling

Aasish Kumar Sharma, Christian Boehme, Julian Kunkel

Heterogeneous HPC workflow scheduling under multiple hard constraints poses a challenging combinatorial optimization problem. Classical exact solvers guarantee optimality but face scalability limits, motivating interest in quantum-inspired Quadratic Unconstrained Binary Optimization (QUBO) as an alternative optimization paradigm. This work presents a systematic empirical evaluation of QUBO-based scheduling methods against classical baselines including MILP, CP-SAT, GA, and HEFT. We evaluate three QUBO variants (single-run simulated annealing, multi-attempt annealing, and a layered QAOA-inspired schedule) with hybrid enhancement strategies on validation workflows (3-4 tasks) and synthetic scaling instances (5-20 tasks). All solvers are assessed through a unified pipeline tracking feasibility, makespan, and resource utilization under progressive constraint activation and controlled penalty sweeps. All approaches recover the expected optimal makespan on validation instances, confirming formulation correctness. However, feasibility degradation emerges for specific QUBO variants as constraint interactions intensify, particularly when communication costs are introduced. Penalty analysis reveals a sharp feasibility threshold for QUBO-SA: insufficient penalties consistently fail, while moderate-to-strong penalties restore feasibility. Scaling experiments show that classical solvers remain robust across all tested sizes, while QUBO-SA loses feasibility beyond 15 tasks and the QAOA-inspired variant beyond 10 tasks. The study provides a clear empirical characterization of the reliability boundaries of quantum-inspired QUBO formulations for HPC scheduling and identifies regimes where classical approaches remain preferable under current solver capabilities.
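The penalty threshold mentioned in the abstract above can be illustrated with a toy example. This is a minimal sketch, not the paper's formulation: it assumes a hypothetical 2-task, 2-node cost table (`COST`), a linear assignment cost in place of makespan, and exhaustive enumeration in place of simulated annealing. It encodes the hard constraint "each task runs on exactly one node" as a quadratic penalty and shows that a penalty weight that is too small makes an infeasible bitstring the energy minimum.

```python
from itertools import product

# Hypothetical costs (assumption): COST[t][n] is the cost of task t on node n.
COST = [[4.0, 6.0],
        [5.0, 3.0]]
N_TASKS, N_NODES = len(COST), len(COST[0])


def task_row(x, t):
    """Return the binary variables of task t from the flat bit tuple x."""
    return x[t * N_NODES:(t + 1) * N_NODES]


def qubo_energy(x, penalty):
    """Assignment cost plus penalty * (sum_n x[t][n] - 1)^2 for every task t."""
    energy = 0.0
    for t in range(N_TASKS):
        row = task_row(x, t)
        energy += sum(c * b for c, b in zip(COST[t], row))
        energy += penalty * (sum(row) - 1) ** 2
    return energy


def is_feasible(x):
    """A bitstring is feasible if every task is placed on exactly one node."""
    return all(sum(task_row(x, t)) == 1 for t in range(N_TASKS))


def best_assignment(penalty):
    """Brute-force the global energy minimum (stand-in for an annealer)."""
    candidates = product((0, 1), repeat=N_TASKS * N_NODES)
    return min(candidates, key=lambda x: qubo_energy(x, penalty))


if __name__ == "__main__":
    for p in (1.0, 10.0):
        x = best_assignment(p)
        print(p, x, is_feasible(x), qubo_energy(x, p))
```

With these assumed costs, `best_assignment(1.0)` leaves both tasks unassigned because paying the penalty is cheaper than running them, while `best_assignment(10.0)` returns a feasible mapping. In this toy model the minimum becomes feasible once the penalty exceeds every task's cheapest cost (4.0 here); real instances with interacting constraints are harder to tune.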

17.1 DC Mar 23
Interactive and Urgent HPC: State of the Research

Albert Reuther, William Arndt, Johannes Blaschke et al.

When we think of how we use smartphones, e-commerce, collaboration platforms, LLMs, etc., most of our interactions with computers are interactive and often urgent. Similar trends of interactivity and urgency are coming to HPC, with applications from simulations to data analysis and machine learning requiring more parallel computational capability and more interactivity. This chapter gives an overview of the progress made so far, along with some directions for integrating interactive and urgent HPC policies, techniques, and technologies more closely into our HPC ecosystems.

34.4 DC May 4 Code
A Treasure Trove of Performance: Analyzing the IO500 Submission Data

Julian Kunkel, Aasish Kumar Sharma, Anila Ghazanfar et al.

The IO500 benchmark has become the community standard for evaluating HPC storage system performance, yet the detailed data contained in its submission packages remains largely unexplored beyond aggregate leaderboard rankings. We present a statistical characterization of 61 IO500 submissions from four competition lists (ISC21 through SC22), examining score distributions, inter-phase correlations, and insights derived from detailed log files that accompany each submission. Our analysis reveals that IO500 scores span four orders of magnitude. Spearman correlation analysis shows strong within-domain clustering for both bandwidth (rs = 0.78 to 0.96) and metadata (rs = 0.89 to 0.98) phases, with the composite sub-scores exhibiting rs = 0.92 at per-node level (Pearson r = 0.53). Log-level analysis uncovers file-system-specific patterns in IOR close-time overhead, straggler behavior during the stonewall wear-down phase, and parallel-find load imbalance that are invisible in aggregate scores. These findings demonstrate that IO500 submission packages constitute a valuable research resource for understanding storage system behavior. The full submission dataset is publicly available at https://github.com/IO500/submission-data, and analysis scripts at https://gitlab-ce.gwdg.de/hpc-team/io500-analysis.
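The abstract above reports a high Spearman coefficient next to a much lower Pearson coefficient for the same pair of sub-scores. A gap of this kind is typical for data that is monotonically related but spans several orders of magnitude. The sketch below illustrates the effect with made-up numbers (not IO500 data) and small pure-Python helpers (`ranks`, `pearson`, `spearman` are names chosen here for illustration).

```python
import math


def ranks(values):
    """1-based ranks; tied values receive the average rank of their group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Illustrative values only: y grows monotonically but not linearly with x.
    x = [1, 2, 3, 4, 5]
    y = [1, 10, 100, 1000, 10000]
    print("Spearman:", spearman(x, y))
    print("Pearson: ", pearson(x, y))
```

On this toy data the rank correlation is essentially 1 because the ordering is perfectly preserved, while the Pearson value is noticeably lower because the largest value dominates the linear fit.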

DC Nov 4, 2025
Evaluating Large Language Models for Workload Mapping and Scheduling in Heterogeneous HPC Systems

Aasish Kumar Sharma, Julian Kunkel

Large language models (LLMs) are increasingly explored for their reasoning capabilities, yet their ability to perform structured, constraint-based optimization from natural language remains insufficiently understood. This study evaluates twenty-one publicly available LLMs on a representative heterogeneous high-performance computing (HPC) workload mapping and scheduling problem. Each model received the same textual description of system nodes, task requirements, and scheduling constraints, and was required to assign tasks to nodes, compute the total makespan, and explain its reasoning. A manually derived analytical optimum of nine hours and twenty seconds served as the ground truth reference. Three models exactly reproduced the analytical optimum while satisfying all constraints, twelve achieved near-optimal results within two minutes of the reference, and six produced suboptimal schedules with arithmetic or dependency errors. All models generated feasible task-to-node mappings, though only about half maintained strict constraint adherence. Nineteen models produced partially executable verification code, and eighteen provided coherent step-by-step reasoning, demonstrating strong interpretability even when logical errors occurred. Overall, the results define the current capability boundary of LLM reasoning in combinatorial optimization: leading models can reconstruct optimal schedules directly from natural language, but most still struggle with precise timing, data transfer arithmetic, and dependency enforcement. These findings highlight the potential of LLMs as explainable co-pilots for optimization and decision-support tasks rather than autonomous solvers.

LG Nov 22, 2024
AdamZ: An Enhanced Optimisation Method for Neural Network Training

Ilia Zaznov, Atta Badii, Alfonso Dufour et al.

AdamZ is an advanced variant of the Adam optimiser, developed to enhance convergence efficiency in neural network training. This optimiser dynamically adjusts the learning rate by incorporating mechanisms to address overshooting and stagnation, which are common challenges in optimisation. Specifically, AdamZ reduces the learning rate when overshooting is detected and increases it during periods of stagnation, utilising hyperparameters such as overshoot and stagnation factors, thresholds, and patience levels to guide these adjustments. While AdamZ may lead to slightly longer training times compared to some other optimisers, it consistently excels in minimising the loss function, making it particularly advantageous for applications where precision is critical. Benchmarking results demonstrate the effectiveness of AdamZ in maintaining optimal learning rates, leading to improved model performance across diverse tasks.
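The adjustment idea described in the abstract above (lower the learning rate on overshooting, raise it on stagnation) might be sketched as follows. This is an illustrative assumption, not the published AdamZ update rule: the class name `AdaptiveLearningRate`, the detection criteria (current loss at or above the recent maximum for overshooting, low standard deviation of recent losses for stagnation), and all default values are hypothetical choices made for this example.

```python
from collections import deque
from statistics import pstdev


class AdaptiveLearningRate:
    """Hypothetical loss-driven learning-rate controller (not the paper's code)."""

    def __init__(self, lr=0.01, overshoot_factor=0.5, stagnation_factor=1.2,
                 stagnation_threshold=0.2, patience=5, min_lr=1e-6, max_lr=1.0):
        self.lr = lr
        self.overshoot_factor = overshoot_factor
        self.stagnation_factor = stagnation_factor
        self.stagnation_threshold = stagnation_threshold
        self.min_lr = min_lr
        self.max_lr = max_lr
        # Sliding window holding the last `patience` loss values.
        self.history = deque(maxlen=patience)

    def update(self, loss):
        """Record a new loss value and return the adjusted learning rate."""
        if len(self.history) == self.history.maxlen:
            if loss >= max(self.history):
                # Assumed overshoot signal: loss is no better than the recent worst.
                self.lr *= self.overshoot_factor
            elif pstdev(self.history) < self.stagnation_threshold:
                # Assumed stagnation signal: recent losses barely vary.
                self.lr *= self.stagnation_factor
            # Keep the learning rate inside the allowed bounds.
            self.lr = min(max(self.lr, self.min_lr), self.max_lr)
        self.history.append(loss)
        return self.lr


if __name__ == "__main__":
    controller = AdaptiveLearningRate(patience=3, stagnation_threshold=0.01)
    for loss in (1.0, 0.9, 0.8, 1.5):
        print(loss, controller.update(loss))
```

In a training loop, the returned value would be written into the optimiser's learning-rate setting after each step or epoch. No adjustment is made until the window holds `patience` values, which avoids reacting to the first few noisy losses.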

AI May 30, 2025
Ethical AI: Towards Defining a Collective Evaluation Framework

Aasish Kumar Sharma, Dimitar Kyosev, Julian Kunkel

Artificial Intelligence (AI) is transforming sectors such as healthcare, finance, and autonomous systems, offering powerful tools for innovation. Yet its rapid integration raises urgent ethical concerns related to data ownership, privacy, and systemic bias. Issues like opaque decision-making, misleading outputs, and unfair treatment in high-stakes domains underscore the need for transparent and accountable AI systems. This article addresses these challenges by proposing a modular ethical assessment framework built on ontological blocks of meaning: discrete, interpretable units that encode ethical principles such as fairness, accountability, and ownership. By integrating these blocks with FAIR (Findable, Accessible, Interoperable, Reusable) principles, the framework supports scalable, transparent, and legally aligned ethical evaluations, including compliance with the EU AI Act. Using a real-world use case in AI-powered investor profiling, the paper demonstrates how the framework enables dynamic, behavior-informed risk classification. The findings suggest that ontological blocks offer a promising path toward explainable and auditable AI ethics, though challenges remain in automation and probabilistic reasoning.

DC May 4, 2023
DECICE: Device-Edge-Cloud Intelligent Collaboration Framework

Julian Kunkel, Christian Boehme, Jonathan Decker et al.

DECICE is a Horizon Europe project that is developing an AI-enabled, open, and portable management framework for the automatic and adaptive optimization and deployment of applications in the computing continuum, ranging from IoT sensors on the Edge to large-scale Cloud/HPC computing infrastructures. In this paper, we describe the DECICE framework and architecture. Furthermore, we highlight use cases for framework evaluation: intelligent traffic intersection, magnetic resonance imaging, and emergency response.