Do Wan Kim

SY
4papers
3citations
Novelty43%
AI Score20

4 Papers

SYJul 31, 2023
Continuous-Time Distributed Dynamic Programming for Networked Multi-Agent Markov Decision Processes

Donghwan Lee, Han-Dong Lim, Do Wan Kim

The main goal of this paper is to investigate continuous-time distributed dynamic programming (DP) algorithms for networked multi-agent Markov decision problems (MAMDPs). In our study, we adopt a distributed multi-agent framework where individual agents have access only to their own rewards, lacking insights into the rewards of other agents. Moreover, each agent has the ability to share its parameters with neighboring agents through a communication network, represented by a graph. We first introduce a novel distributed DP, inspired by the distributed optimization method of Wang and Elia. Next, a new distributed DP is introduced through a decoupling process. The convergence of the DP algorithms is proved through systems and control perspectives. The study in this paper sets the stage for new distributed temporal different learning algorithms.

LGApr 22, 2022
Finite-Time Accuracy of Temporal-Difference Learning Under Schur-Stable Recursions

Donghwan Lee, Do Wan Kim

Temporal difference (TD) learning is a cornerstone reinforcement learning (RL) method for policy evaluation, where the goal is to estimate the value function of a Markov decision process under a fixed policy. While a substantial body of work has established its convergence and stability properties, more recent efforts have focused on its statistical efficiency through finite-time error bounds. In this paper, we advance this line of research by developing a new finite-time error analysis for tabular TD learning that directly exploits a discrete-time stochastic linear system representation and leverages Schur stability of the associated matrices. Beyond the specific bounds obtained, the proposed framework provides a reusable template for analyzing TD learning and related RL algorithms, and it offers control-theoretic insights that may guide future developments in finite-sample RL theory.

SYSep 13, 2023
On the Local Quadratic Stability of T-S Fuzzy Systems in the Vicinity of the Origin

Donghwan Lee, Do Wan Kim

The main goal of this paper is to introduce new local stability conditions for continuous-time Takagi-Sugeno (T-S) fuzzy systems. These stability conditions are based on linear matrix inequalities (LMIs) in combination with quadratic Lyapunov functions. Moreover, they integrate information on the membership functions at the origin and effectively leverage the linear structure of the underlying nonlinear system in the vicinity of the origin. As a result, the proposed conditions are proved to be less conservative compared to existing methods using fuzzy Lyapunov functions in the literature. Moreover, we establish that the proposed methods offer necessary and sufficient conditions for the local exponential stability of T-S fuzzy systems. The paper also includes discussions on the inherent limitations associated with fuzzy Lyapunov approaches. To demonstrate the theoretical results, we provide comprehensive examples that elucidate the core concepts and validate the efficacy of the proposed conditions.

AIDec 29, 2021
Control Theoretic Analysis of Temporal Difference Learning

Donghwan Lee, Do Wan Kim

The goal of this manuscript is to conduct a controltheoretic analysis of Temporal Difference (TD) learning algorithms. TD-learning serves as a cornerstone in the realm of reinforcement learning, offering a methodology for approximating the value function associated with a given policy in a Markov Decision Process. Despite several existing works that have contributed to the theoretical understanding of TD-learning, it is only in recent years that researchers have been able to establish concrete guarantees on its statistical efficiency. In this paper, we introduce a finite-time, control-theoretic framework for analyzing TD-learning, leveraging established concepts from the field of linear systems control. Consequently, this paper provides additional insights into the mechanics of TD learning and the broader landscape of reinforcement learning, all while employing straightforward analytical tools derived from control theory.