Dong Ho Lee

GT
3papers
12citations
Novelty63%
AI Score45

3 Papers

LGMar 2, 2023
Multi-Start Team Orienteering Problem for UAS Mission Re-Planning with Data-Efficient Deep Reinforcement Learning

Dong Ho Lee, Jaemyung Ahn

In this paper, we study the Multi-Start Team Orienteering Problem (MSTOP), a mission re-planning problem where vehicles are initially located away from the depot and have different amounts of fuel. We consider/assume the goal of multiple vehicles is to travel to maximize the sum of collected profits under resource (e.g., time, fuel) consumption constraints. Such re-planning problems occur in a wide range of intelligent UAS applications where changes in the mission environment force the operation of multiple vehicles to change from the original plan. To solve this problem with deep reinforcement learning (RL), we develop a policy network with self-attention on each partial tour and encoder-decoder attention between the partial tour and the remaining nodes. We propose a modified REINFORCE algorithm where the greedy rollout baseline is replaced by a local mini-batch baseline based on multiple, possibly non-duplicate sample rollouts. By drawing multiple samples per training instance, we can learn faster and obtain a stable policy gradient estimator with significantly fewer instances. The proposed training algorithm outperforms the conventional greedy rollout baseline, even when combined with the maximum entropy objective.

58.4GTMar 27
Breaking Exponential Complexity in Games of Ordered Preference: A Tractable Reformulation

Dong Ho Lee, Jingqi Li, Lasse Peters et al.

Games of ordered preference (GOOPs) model multi-player equilibrium problems in which each player maintains a distinct hierarchy of strictly prioritized objectives. Existing approaches solve GOOPs by deriving and enforcing the necessary optimality conditions that characterize lexicographically constrained Nash equilibria through a single-level reformulation. However, the number of primal and dual variables in the resulting KKT system grows exponentially with the number of preference levels, leading to severe scalability challenges. We derive a compact reformulation of these necessary conditions that preserves the essential primal stationarity structure across hierarchy levels, yielding a "reduced" KKT system whose size grows polynomially with both the number of players and the number of preference levels. The reduced system constitutes a relaxation of the complete KKT system, yet it remains a valid necessary condition for local GOOP equilibria. For GOOPs with quadratic objectives and linear constraints, we prove that the primal solution sets of the reduced and complete KKT systems coincide. More generally, for GOOPs with arbitrary (but smooth) nonlinear objectives and constraints, the reduced KKT conditions recover all local GOOP equilibria but may admit spurious non-equilibrium solutions. We introduce a second-order sufficient condition to certify when a candidate point corresponds to a local GOOP equilibrium. We also develop a primal-dual interior-point method for computing a local GOOP equilibrium with local quadratic convergence. The resulting framework enables scalable and efficient computation of GOOP equilibria beyond the tractable range of existing exponentially complex formulations.

59.1GTMay 14
Efficiently Solving Mixed-Hierarchy Games with Quasi-Policy Approximations

Hamzah Khan, Dong Ho Lee, Jingqi Li et al.

Multi-robot coordination often exhibits hierarchical structure, with some robots' decisions depending on the planned behaviors of others. While game theory provides a principled framework for such interactions, existing solvers struggle to handle mixed information structures that combine simultaneous (Nash) and hierarchical (Stackelberg) decision-making. We study N-robot forest-structured mixed-hierarchy games, in which each robot acts as a Stackelberg leader over its subtree while robots in different branches interact via Nash equilibria. We derive the Karush-Kuhn-Tucker (KKT) first-order optimality conditions for this class of games and show that they involve increasingly high-order derivatives of robots' best-response policies as the hierarchy depth grows, rendering a direct solution intractable. To overcome this challenge, we introduce a quasi-policy approximation that removes higher-order policy derivatives and develop an inexact Newton method for efficiently solving the resulting approximated KKT systems. We prove local exponential convergence of the proposed algorithm for games with non-quadratic objectives and nonlinear constraints. The approach is implemented in a highly optimized Julia library (MixedHierarchyGames.jl) and evaluated in hardware and simulated multi-agent experiments, demonstrating real-time convergence for complex mixed-hierarchy information structures.