Volkmar Sterzing

5.3 LG Aug 11, 2023

Learning Control Policies for Variable Objectives from Offline Data

Marc Weber, Phillip Swazinna, Daniel Hein et al.

Offline reinforcement learning provides a viable approach to obtain advanced control strategies for dynamical systems, in particular when direct interaction with the environment is not available. In this paper, we introduce a conceptual extension for model-based policy search methods, called variable objective policy (VOP). With this approach, policies are trained to generalize efficiently over a variety of objectives, which parameterize the reward function. We demonstrate that by altering the objectives passed as input to the policy, users gain the freedom to adjust its behavior or re-balance optimization targets at runtime, without the need to collect additional observation batches or to re-train.

4.9 LG Oct 12, 2016

Introduction to the "Industrial Benchmark"

Daniel Hein, Alexander Hentschel, Volkmar Sterzing et al.

A novel reinforcement learning benchmark, called Industrial Benchmark, is introduced. The Industrial Benchmark aims to be realistic in the sense that it includes a variety of aspects that we found to be vital in industrial applications. It is not designed to approximate any real system, but to pose the same hardness and complexity.

2 Papers