Roman Bauer

CL
h-index: 10
4 papers
12 citations
Novelty: 25%
AI Score: 39

4 Papers

CL | Mar 15 | Code
Parameter-Efficient Quality Estimation via Frozen Recursive Models

Umar Abubacar, Roman Bauer, Diptesh Kanojia

Tiny Recursive Models (TRM) achieve strong results on reasoning tasks through iterative refinement of a shared network. We investigate whether these recursive mechanisms transfer to Quality Estimation (QE) for low-resource languages using a three-phase methodology. Experiments on 8 language pairs from a low-resource QE dataset reveal three findings. First, TRM's recursive mechanisms do not transfer to QE: external iteration hurts performance, and internal recursion offers only narrow benefits. Second, representation quality dominates architectural choices. Third, frozen pretrained embeddings match fine-tuned performance while reducing trainable parameters by 37× (7M vs 262M). TRM-QE with frozen XLM-R embeddings achieves a Spearman's correlation of 0.370, matching fine-tuned variants (0.369) and outperforming an equivalent-depth standard transformer (0.336). On Hindi and Tamil, frozen TRM-QE outperforms MonoTransQuest (560M parameters) with 80× fewer trainable parameters, suggesting that weight sharing combined with frozen embeddings enables parameter efficiency for QE. Code is publicly available for further research at https://github.com/surrey-nlp/TRMQE.

SE | Dec 9, 2025 | Code
Evolving Excellence: Automated Optimization of LLM-based Agents

Paul Brookes, Vardan Voskanyan, Rafail Giavrimis et al.

Agentic AI systems built on large language models (LLMs) offer significant potential for automating complex workflows, from software development to customer support. However, LLM agents often underperform due to suboptimal configurations: poorly tuned prompts, tool descriptions, and parameters that typically require weeks of manual refinement. Existing optimization methods are either too complex for general use or treat components in isolation, missing critical interdependencies. We present ARTEMIS, a no-code evolutionary optimization platform that jointly optimizes agent configurations through semantically-aware genetic operators. Given only a benchmark script and natural language goals, ARTEMIS automatically discovers configurable components, extracts performance signals from execution logs, and evolves configurations without requiring architectural modifications. We evaluate ARTEMIS on four representative agent systems: the ALE Agent for competitive programming on AtCoder Heuristic Contest, achieving a 13.6% improvement in acceptance rate; the Mini-SWE Agent for code optimization on SWE-Perf, with a statistically significant 10.1% performance gain; and the CrewAI Agent for cost and mathematical reasoning on Math Odyssey, achieving a statistically significant 36.9% reduction in the number of tokens required for evaluation. We also evaluate the MathTales-Teacher Agent, powered by a smaller open-source model (Qwen2.5-7B), on GSM8K primary-level mathematics problems, achieving a 22% accuracy improvement and demonstrating that ARTEMIS can optimize agents based on both commercial and local models.

DC | Aug 17, 2016
The BioDynaMo Project: Creating a Platform for Large-Scale Reproducible Biological Simulations

Lukas Breitwieser, Roman Bauer, Alberto Di Meglio et al.

Computer simulations have become a very powerful tool for scientific research. In order to facilitate research in computational biology, the BioDynaMo project aims at a general platform for biological computer simulations, which should be executable on hybrid cloud computing systems. This paper describes challenges and lessons learnt during the early stages of the software development process, in the context of implementation issues and the international nature of the collaboration.

NE | Jul 10, 2016
The BioDynaMo Project

Roman Bauer, Lukas Breitwieser, Alberto Di Meglio et al.

Computer simulations have become a very powerful tool for scientific research. Given the vast complexity that comes with many open scientific questions, a purely analytical or experimental approach is often not viable. For example, biological systems (such as the human brain) comprise an extremely complex organization and heterogeneous interactions across different spatial and temporal scales. In order to facilitate research on such problems, the BioDynaMo project (https://biodynamo.web.cern.ch/) aims at a general platform for computer simulations for biological research. Since the scientific investigations require extensive computer resources, this platform should be executable on hybrid cloud computing systems, allowing for the efficient use of state-of-the-art computing technology. This paper describes challenges during the early stages of the software development process. In particular, we describe issues regarding the implementation and the highly interdisciplinary as well as international nature of the collaboration. Moreover, we explain the methodologies, the approach, and the lessons learnt by the team during these first stages.