Felix Thömmes

3 Papers

SY · Apr 28
Inverse Linear-Quadratic Gaussian Differential Games

Lucas Günther, Felix Thömmes, Karl Handwerker et al.

This paper presents a method for solving the Inverse Stochastic Differential Game (ISDG) problem in finite-horizon linear-quadratic Gaussian (LQG) differential games. The objective is to recover cost function parameters of all players, as well as noise scaling parameters of the stochastic system, consistent with observed trajectories. The proposed framework combines (i) estimation of the feedback strategies, (ii) identification of the cost function parameters via a novel reformulation of the coupled Riccati differential equations, and (iii) maximum likelihood estimation of the noise scaling parameters. Simulation results demonstrate that the approach recovers parameters, yielding trajectories that closely match the observed trajectories.
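As a rough illustration of step (iii), the sketch below fits a noise scaling parameter by maximum likelihood in a much simpler setting than the paper's: a scalar, time-invariant closed-loop system dx = a_cl x dt + sigma dW, discretized with Euler-Maruyama. The function names (`simulate`, `fit_drift_and_noise`), the model, and all numerical values are assumptions made for this example and are not taken from the paper, whose setting is a finite-horizon game with several players.

```python
import math
import random


def simulate(a_cl, sigma, x0, dt, steps, seed=0):
    """Euler-Maruyama simulation of dx = a_cl * x dt + sigma dW (hypothetical toy model)."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        xs.append(x + a_cl * x * dt + noise)
    return xs


def fit_drift_and_noise(xs, dt):
    """Estimate closed-loop drift and noise scale from one sampled trajectory.

    Under the discretized model the increments are Gaussian with mean
    a_cl * x * dt and variance sigma**2 * dt, so least squares on the
    increments gives the drift estimate, and the mean squared residual
    divided by dt gives the maximum likelihood estimate of sigma**2.
    """
    states = xs[:-1]
    increments = [x1 - x0 for x0, x1 in zip(xs, xs[1:])]
    a_hat = sum(x * dx for x, dx in zip(states, increments)) / (
        dt * sum(x * x for x in states)
    )
    residuals = [dx - a_hat * x * dt for x, dx in zip(states, increments)]
    sigma_hat = math.sqrt(sum(r * r for r in residuals) / (len(residuals) * dt))
    return a_hat, sigma_hat


# Illustrative values only: true drift -1.0, true noise scale 0.5.
trajectory = simulate(a_cl=-1.0, sigma=0.5, x0=1.0, dt=0.01, steps=20000)
a_hat, sigma_hat = fit_drift_and_noise(trajectory, dt=0.01)
print(a_hat, sigma_hat)
```

In this toy setting the noise scale is typically recovered much more tightly than the drift, because every increment carries information about sigma while the drift estimate improves only with the total observation time.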

PF · Sep 13, 2024
Automatic Generation of Fast and Accurate Performance Models for Deep Neural Network Accelerators

Konstantin Lübeck, Alexander Louis-Ferdinand Jung, Felix Wedlich et al.

Implementing Deep Neural Networks (DNNs) on resource-constrained edge devices is a challenging task that requires tailored hardware accelerator architectures and a clear understanding of their performance characteristics when executing the intended AI workload. To facilitate this, we present an automated generation approach for fast performance models that accurately estimate the latency of a DNN mapped onto systematically modeled and concisely described accelerator architectures. Using our accelerator architecture description method, we modeled representative DNN accelerators such as Gemmini, UltraTrail, a Plasticine-derived architecture, and a parameterizable systolic array. Together with DNN mappings for those modeled architectures, we perform a combined DNN/hardware dependency graph analysis, which enables us, in the best case, to evaluate only 154 loop kernel iterations to estimate the performance for 4.19 billion instructions, achieving a significant speedup. We outperform regression and analytical models in terms of mean absolute percentage error (MAPE) compared to simulation results, while being several orders of magnitude faster than an RTL simulation.
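For readers unfamiliar with the accuracy metric named in the abstract, here is a minimal sketch of how MAPE is commonly computed against reference values such as simulated latencies. The function name `mape` and the sample numbers are made up for illustration and do not come from the paper.

```python
def mape(reference, estimate):
    """Mean absolute percentage error of `estimate` relative to `reference`, in percent."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(r == 0 for r in reference):
        raise ValueError("MAPE is undefined when a reference value is zero")
    total = sum(abs((r - e) / r) for r, e in zip(reference, estimate))
    return 100.0 * total / len(reference)


# Hypothetical latencies (arbitrary units): reference simulation vs. model estimate.
simulated = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mape(simulated, predicted))
```

Because each error is divided by its reference value, MAPE weights a 10% miss on a short layer the same as a 10% miss on a long one, which is one reason it is a common choice for comparing latency models across workloads of very different size.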

OC · Apr 30
Data-Driven Continuous-Time Linear Quadratic Regulator via Closed-Loop and Reinforcement Learning Parameterizations

Armin Gießler, Felix Thömmes, Sören Hohmann

This paper studies data-driven approaches to the continuous-time linear quadratic regulator (LQR) problem based on two existing parameterizations, namely a closed-loop (CL) parameterization from behavioral system theory and an integral reinforcement learning (IRL) parameterization. The CL parameterization characterizes the closed-loop system via a matrix that satisfies equality constraints. While this parameterization has been extensively studied for discrete-time systems, we adapt key results to the continuous-time setting and develop a policy iteration (PI) scheme, derive a data-driven continuous-time algebraic Riccati equation (CARE), and introduce an alternative convex problem formulation. The IRL parameterization utilizes off-policy data to perform policy evaluation, which is then used for PI or value iteration. Within the IRL framework, we derive a policy gradient flow and propose convex reformulations of the LQR problem. Finally, we provide a unified treatment of these parameterizations that enables a systematic understanding of existing approaches and clarifies their structural relationships.
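To make the policy iteration idea concrete, the sketch below runs the classical model-based (Kleinman-style) policy iteration on a scalar continuous-time LQR problem and compares the result with the closed-form CARE solution. It assumes the model (a, b) is known, so it is not the data-driven CL or IRL scheme studied in the paper; the function names and numerical values are assumptions chosen for this example.

```python
import math


def care_scalar(a, b, q, r):
    """Positive root of the scalar CARE: 2*a*p - (b**2 / r) * p**2 + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def policy_iteration(a, b, q, r, k0, iterations=20):
    """Model-based policy iteration for dx/dt = a*x + b*u with u = -k*x.

    Cost: integral of q*x**2 + r*u**2. The initial gain k0 must be
    stabilizing, i.e. a - b*k0 < 0.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain must stabilize the closed loop")
    if iterations < 1:
        raise ValueError("at least one iteration is required")
    k = k0
    for _ in range(iterations):
        a_cl = a - b * k
        # Policy evaluation: scalar Lyapunov equation 2*a_cl*p + q + r*k**2 = 0.
        p = (q + r * k * k) / (-2.0 * a_cl)
        # Policy improvement: k = b*p / r.
        k = b * p / r
    return p, k


# Illustrative unstable plant (a = 1) with unit weights and a stabilizing start.
p_pi, k_pi = policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
print(p_pi, care_scalar(1.0, 1.0, 1.0, 1.0))
```

Each pass alternates policy evaluation (a Lyapunov equation for the current gain) with policy improvement (a new gain from the resulting value). Data-driven variants replace the evaluation step with quantities estimated from trajectories rather than from a known model.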