CL Apr 7
ValueGround: Evaluating Culture-Conditioned Visual Value Grounding in MLLMs
Zhipin Wang, Christoph Leiter, Christian Frey et al.
Cultural values are expressed not only through language but also through visual scenes and everyday social practices. Yet existing evaluations of cultural values in language models are almost entirely text-only, making it unclear whether models can ground culture-conditioned judgments when response options are visualized. We introduce ValueGround, a benchmark for evaluating culture-conditioned visual value grounding in multimodal large language models (MLLMs). Built from World Values Survey (WVS) questions, ValueGround uses minimally contrastive image pairs to represent opposing response options while controlling irrelevant variation. Given a country, a question, and an image pair, a model must choose the image that best matches the country's value tendency without access to the original response-option texts. Across six MLLMs and 13 countries, average accuracy drops from 72.8% in the text-only setting to 65.8% when options are visualized, despite 92.8% accuracy on option-image alignment. Stronger models are more robust, but all remain prone to prediction reversals. Our benchmark provides a controlled testbed for studying cross-modal transfer of culture-conditioned value judgments.
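The comparison described above (accuracy with text options versus accuracy with visualized options, plus prediction reversals) can be illustrated with a small scoring sketch. This is not the benchmark's code: the `Item` record, its field names, the toy data, and the definition of a reversal as "the model's choice differs between the two settings" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One (country, question) instance; all field names are hypothetical."""
    gold: str        # option matching the country's value tendency: "A" or "B"
    text_pred: str   # model choice when options are shown as text
    image_pred: str  # model choice when options are shown as an image pair


def accuracy(items, field):
    """Fraction of items whose prediction in `field` equals the gold option."""
    return sum(getattr(it, field) == it.gold for it in items) / len(items)


def reversal_rate(items):
    """Fraction of items where the choice flips between the two settings
    (assumed definition of a 'prediction reversal')."""
    return sum(it.text_pred != it.image_pred for it in items) / len(items)


# Toy data, invented purely to exercise the functions.
items = [
    Item(gold="A", text_pred="A", image_pred="A"),
    Item(gold="B", text_pred="B", image_pred="A"),
    Item(gold="A", text_pred="B", image_pred="B"),
    Item(gold="B", text_pred="B", image_pred="B"),
]

text_acc = accuracy(items, "text_pred")
image_acc = accuracy(items, "image_pred")
print(f"text-only: {text_acc:.2f}, visualized: {image_acc:.2f}, "
      f"reversals: {reversal_rate(items):.2f}")
```

Reporting the reversal rate alongside the two accuracies separates a model that is consistently wrong from one that changes its answer when the modality changes.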
LG Mar 15
Learning to Order: Task Sequencing as In-Context Optimization
Jan Kobiolka, Christian Frey, Arlind Kadra et al.
Task sequencing (TS) is one of the core open problems in Deep Learning, arising in a plethora of real-world domains, from robotic assembly lines to autonomous driving. Unfortunately, prior work has not convincingly demonstrated that meta-learned TS methods generalize to new TS problems given only a few initial demonstrations. In this paper, we demonstrate that deep neural networks can meta-learn over an infinite prior of synthetically generated TS problems and achieve few-shot generalization. We meta-learn a transformer-based architecture over datasets of sequencing trajectories generated from a prior distribution that samples sequencing problems as paths in directed graphs. In a large-scale experiment, we provide ample empirical evidence that our meta-learned models discover optimal task sequences significantly faster than non-meta-learned baselines.
LG Feb 17
POP: Prior-fitted Optimizer Policies
Jan Kobiolka, Christian Frey, Gresa Shala et al.
Optimization refers to the task of finding extrema of an objective function. Classical gradient-based optimizers are highly sensitive to hyperparameter choices. In highly non-convex settings, their performance relies on carefully tuned learning rates, momentum, and gradient accumulation. To address these limitations, we introduce POP (Prior-fitted Optimizer Policies), a meta-learned optimizer that predicts coordinate-wise step sizes conditioned on the contextual information provided in the optimization trajectory. Our model is learned on millions of synthetic optimization problems sampled from a novel prior spanning both convex and non-convex objectives. We evaluate POP on an established benchmark comprising 47 optimization functions of varying complexity, where it consistently outperforms first-order gradient-based methods, non-convex optimization approaches (e.g., evolutionary strategies), Bayesian optimization, and a recent meta-learned competitor under matched budget constraints. Our evaluation demonstrates strong generalization capabilities without task-specific tuning.
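To make "coordinate-wise step sizes conditioned on the optimization trajectory" concrete, here is a minimal sketch of the update loop. It does not reproduce POP: the learned policy is replaced by a hand-written, Rprop-style sign rule (`toy_policy`), and the quadratic objective, the function names, and all constants are assumptions chosen for illustration.

```python
def objective(x, scales):
    """Toy ill-conditioned quadratic: f(x) = sum_i a_i * x_i^2."""
    return sum(a * xi * xi for a, xi in zip(scales, x))


def gradient(x, scales):
    """Analytic gradient of the toy quadratic."""
    return [2.0 * a * xi for a, xi in zip(scales, x)]


def toy_policy(grad, prev_grad, steps):
    """Hypothetical stand-in for a learned policy.

    Uses the last two gradients (a tiny slice of the trajectory) to set one
    step size per coordinate: grow it while the gradient sign is stable,
    shrink it when the sign flips.
    """
    new_steps = []
    for g, pg, s in zip(grad, prev_grad, steps):
        if g * pg > 0:
            s = min(s * 1.2, 1.0)
        elif g * pg < 0:
            s = max(s * 0.5, 1e-6)
        new_steps.append(s)
    return new_steps


def optimize(x, scales, n_iters=100, init_step=0.1):
    """Gradient descent with per-coordinate step sizes chosen by the policy."""
    steps = [init_step] * len(x)
    prev_grad = [0.0] * len(x)
    for _ in range(n_iters):
        grad = gradient(x, scales)
        steps = toy_policy(grad, prev_grad, steps)
        x = [xi - s * g for xi, s, g in zip(x, steps, grad)]
        prev_grad = grad
    return x


scales = [1.0, 100.0]  # curvature differs by 100x across coordinates
x_final = optimize([1.0, 1.0], scales)
final_value = objective(x_final, scales)
print(final_value)
```

A single global step size of 0.1 would diverge on the steep coordinate here, whereas adapting each coordinate separately lets the sketch recover. This is the kind of sensitivity to hyperparameters the abstract refers to.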
LG Dec 18, 2025
Towards Reproducibility in Predictive Process Mining: SPICE -- A Deep Learning Library
Oliver Stritzel, Nick Hühnerbein, Simon Rauch et al.
In recent years, Predictive Process Mining (PPM) techniques based on artificial neural networks have evolved as a method for monitoring the future behavior of unfolding business processes and predicting Key Performance Indicators (KPIs). However, many PPM approaches lack reproducibility, transparency in decision making, and usability for incorporating novel datasets and benchmarking, making comparisons among different implementations very difficult. In this paper, we propose SPICE, a Python framework that reimplements three popular baseline deep-learning-based methods for PPM in PyTorch, while designing a common base framework with rigorous configurability to enable reproducible and robust comparison of past and future modelling approaches. We compare SPICE against the originally reported metrics and against fair metrics on 11 datasets.
CL Oct 22, 2025
Zhyper: Factorized Hypernetworks for Conditioned LLM Fine-Tuning
M. H. I. Abdalla, Zhipin Wang, Christian Frey et al.
Large Language Model (LLM) conditioning refers to instructing an LLM to generate content in accordance with the norms and values of a specific culture, the beliefs of a particular political orientation, or any desired text-specified semantic conditioning. Unfortunately, prompt engineering does not ensure that LLMs behave in accordance with a desired conditioning due to the inductive bias of the pre-training and alignment datasets. Prior works have focused on fine-tuning LLMs by directly conditioning the LoRA weights; however, such methods introduce a large number of parameters. As a remedy, we propose Zhyper, a parameter-efficient factorized hypernetwork framework that generates context-aware LoRA adapters from textual descriptions. Experiments on multiple benchmarks show that Zhyper achieves competitive performance with up to 26x fewer parameters than the state-of-the-art baselines. Furthermore, we extend Zhyper to cultural alignment, demonstrating improved generalization to out-of-domain settings and better capture of fine-grained contextual values.
RO Jun 2, 2020
Workspace monitoring and planning for safe mobile manipulation
Christian Frese, Angelika Zube, Christian Frey
To enable physical human-robot interaction where humans and (mobile) manipulators share their workspace and work together, robots have to be equipped with important capabilities to guarantee human safety. The robots have to recognize possible collisions with the human co-worker and react in anticipation by adapting their motion to avert dangerous situations while they are executing their task. Therefore, methods have been developed that monitor the workspace of mobile manipulators using multiple depth sensors to gather information about the robot environment. This encompasses both 3D information about obstacles in the close robot surroundings and the prediction of obstacle motions in the entire monitored space. Based on this information, a collision-free robot motion is planned, and during execution the robot continuously reacts to unforeseen dangerous situations by adapting its planned motion, slowing down, or stopping. For the demonstration of a manufacturing scenario, the developed methods have been implemented on a prototypical mobile manipulator. The algorithms handle both robot platform and manipulator in a uniform manner so that an overall optimization of the path and of the collision avoidance behavior is possible. By integrating the monitoring, planning, and interaction control components, the task of grasping, placing, and delivering objects to humans in a shared workspace is demonstrated.