SY, Nov 14, 2019
Potential Game-Based Non-Myopic Sensor Network Planning for Multi-Target Tracking
Su-Jin Lee, Soon-Seo Park, Han-Lim Choi
This paper presents a potential game-based method for non-myopic planning of mobile sensor networks in the context of target tracking. The planning objective is to select the sequence of sensing points over more than one future time step to maximize information about the target states. This multi-step lookahead scheme is studied to avoid getting trapped at a local information maximum when there are gaps in sensing coverage due to constraints on sensor platform mobility or limitations in sensing capabilities. However, long-term planning becomes computationally intractable as the length of the planning horizon increases. This work develops a game-theoretic approach to address the computational challenges. The main contributions of this paper are twofold: (a) to formulate a non-myopic planning problem for tracking multiple targets as a potential game, the size of which increases linearly with the number of planning steps; and (b) to design a learning algorithm exploiting joint strategy fictitious play and dynamic programming, which overcomes the gaps in sensing coverage. Numerical examples of multi-target tracking demonstrate that the proposed method gives better estimation performance than myopic planning and is computationally tractable.
LG, Nov 16, 2020
Distilling a Hierarchical Policy for Planning and Control via Representation and Reinforcement Learning
Jung-Su Ha, Young-Jin Park, Hyeok-Joo Chae et al.
We present a hierarchical planning and control framework that enables an agent to perform various tasks and adapt to a new task flexibly. Rather than learning an individual policy for each particular task, the proposed framework, DISH, distills a hierarchical policy from a set of tasks by representation and reinforcement learning. The framework is based on the idea of latent variable models that represent high-dimensional observations using low-dimensional latent variables. The resulting policy consists of two levels of hierarchy: (i) a planning module that reasons a sequence of latent intentions that would lead to an optimistic future and (ii) a feedback control policy, shared across the tasks, that executes the inferred intention. Because the planning is performed in low-dimensional latent space, the learned policy can immediately be used to solve or adapt to new tasks without additional training. We demonstrate the proposed framework can learn compact representations (3- and 1-dimensional latent states and commands for a humanoid with 197- and 36-dimensional state features and actions) while solving a small number of imitation tasks, and the resulting policy is directly applicable to other types of tasks, i.e., navigation in cluttered environments. Video: https://youtu.be/HQsQysUWOhg
RO, Mar 14, 2019
Online Gaussian Process State-Space Model: Learning and Planning for Partially Observable Dynamical Systems
Soon-Seo Park, Young-Jin Park, Youngjae Min et al.
This paper proposes an online learning method for the Gaussian process state-space model (GP-SSM). GP-SSM is a probabilistic representation learning scheme that represents unknown state transition and/or measurement models as Gaussian processes (GPs). While the majority of the prior literature on learning GP-SSMs focuses on processing a given set of time series data, in most dynamical systems data arrive and accumulate sequentially over time. Storing all such sequential data and updating the model over the entire data set incurs a large amount of computational resources in space and time. To overcome this difficulty, we propose a practical method, termed onlineGPSSM, that incorporates stochastic variational inference (VI) and online VI with a novel formulation. The proposed method mitigates the computational complexity without catastrophic forgetting and also supports adaptation to changes in a system and/or a real environment. Furthermore, we present an application of onlineGPSSM to reinforcement learning (RL) of partially observable dynamical systems by integrating onlineGPSSM with Bayesian filtering and trajectory optimization algorithms. Numerical examples are presented to demonstrate the applicability of the proposed method.
RO, Jul 29, 2018
A Distributed ADMM Approach to Non-Myopic Path Planning for Multi-Target Tracking
Soon-Seo Park, Youngjae Min, Jung-Su Ha et al.
This paper investigates non-myopic path planning of mobile sensors for multi-target tracking. Such a problem poses high computational complexity and/or requires high-level decision making. Existing works tackle these issues by heuristically assigning targets to each sensing agent and solving the split problem for each agent. However, such heuristic methods reduce target estimation performance because they do not consider how the target state estimates change over time. In this work, we bypass the task-assignment problem by reformulating the general non-myopic planning problem as a distributed optimization problem with respect to targets. By combining the alternating direction method of multipliers (ADMM) with a local trajectory optimization method, we solve the problem and induce consensus (i.e., high-level decisions) automatically among the targets. In addition, we propose a modified receding-horizon control (RHC) scheme and an edge-cutting method for efficient real-time operation. The proposed algorithm is validated through simulations in various scenarios.
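As a rough illustration of how ADMM induces consensus among subproblems, the following is a minimal sketch on a toy scalar problem, not the formulation used in the paper. The quadratic local costs, the function name `consensus_admm`, and the parameters `rho` and `iters` are all assumptions made for this example; in the paper each local step would be a trajectory optimization rather than a closed-form update.

```python
def consensus_admm(a, rho=1.0, iters=100):
    """Toy consensus ADMM: minimize sum_i 0.5*(x_i - a_i)^2 subject to x_i = z.

    Hypothetical stand-in: each a_i plays the role of what one subproblem
    would prefer on its own; z is the shared decision they must agree on.
    """
    n = len(a)
    x = [0.0] * n  # local copies, one per subproblem
    u = [0.0] * n  # scaled dual variables
    z = 0.0        # shared consensus variable
    for _ in range(iters):
        # Local step: closed-form minimizer of
        # 0.5*(x - a_i)^2 + (rho/2)*(x - z + u_i)^2
        x = [(a_i + rho * (z - u_i)) / (1.0 + rho) for a_i, u_i in zip(a, u)]
        # Consensus step: average of local copies plus scaled duals
        z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / n
        # Dual step: accumulate each copy's disagreement with the consensus
        u = [u_i + x_i - z for x_i, u_i in zip(x, u)]
    return z, x


z, x = consensus_admm([1.0, 2.0, 3.0])
```

For these quadratic costs the consensus value `z` approaches the average of the inputs, and every local copy in `x` approaches `z`, without any explicit assignment step.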
LG, Jul 5, 2018
Adaptive Path-Integral Autoencoder: Representation Learning and Planning for Dynamical Systems
Jung-Su Ha, Young-Jin Park, Hyeok-Joo Chae et al.
We present a representation learning algorithm that learns a low-dimensional latent dynamical system from high-dimensional sequential raw data, e.g., video. The framework builds upon recent advances in amortized inference methods that use both an inference network and a refinement procedure to output samples from a variational distribution given an observation sequence, and takes advantage of the duality between control and inference to approximately solve the intractable inference problem using the path integral control approach. The learned dynamical model can be used to predict and plan future states; we also present an efficient planning method that exploits the learned low-dimensional latent dynamics. Numerical experiments show that the proposed path-integral-control-based variational inference method leads to tighter lower bounds in statistical model learning of sequential data. The supplementary video: https://youtu.be/xCp35crUoLQ
RO, Mar 16, 2016
Topology-Guided Path Integral Approach for Stochastic Optimal Control in Cluttered Environment
Jung-Su Ha, Soon-Seo Park, Han-Lim Choi
This paper addresses planning and control of robot motion under uncertainty, formulated as a continuous-time, continuous-space stochastic optimal control problem, by developing a topology-guided path integral control method. The path integral control framework, which forms the backbone of the proposed method, rewrites the Hamilton-Jacobi-Bellman equation as a statistical inference problem; the resulting inference problem is solved by a sampling procedure that computes the distribution of controlled trajectories around the trajectory generated by the passive dynamics. For motion control of robots in a highly cluttered environment, however, this sampling can easily be trapped in a local minimum unless the sample size is very large, since the global optimality of local minima depends on the degree of uncertainty. Thus, a homology-embedded sampling-based planner that identifies many (potentially) local-minimum trajectories in different homology classes is developed to aid the sampling process. Combined with receding-horizon optimal control, the proposed method produces dynamically feasible and collision-free motion plans without being trapped in a local minimum. Numerical examples on a synthetic toy problem and on quadrotor control in a complex obstacle field demonstrate the validity of the proposed method.
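The sampling step of path integral control can be sketched on a toy problem as follows. This is only an illustrative sketch under strong assumptions, not the topology-guided method of the paper: the system is a 1-D integrator with terminal cost only, the temperature `lam` is treated as a free parameter, and all names (`rollout_cost`, `path_integral_update`) are hypothetical.

```python
import math
import random


def rollout_cost(u, noise, x0=0.0, goal=1.0, dt=0.1):
    """Terminal cost of a 1-D integrator x' = u + noise, simulated with Euler steps."""
    x = x0
    for u_t, e_t in zip(u, noise):
        x += (u_t + e_t) * dt
    return (x - goal) ** 2


def path_integral_update(u, n_samples=500, sigma=1.0, lam=0.1, seed=0):
    """One importance-weighted update of the nominal control sequence u."""
    rng = random.Random(seed)
    # Sample perturbed trajectories around the nominal control
    noises = [[rng.gauss(0.0, sigma) for _ in u] for _ in range(n_samples)]
    costs = [rollout_cost(u, eps) for eps in noises]
    c_min = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(weights)
    weights = [w / total for w in weights]
    # New control = old control + cost-weighted average of the perturbations
    return [u_t + sum(w * eps[t] for w, eps in zip(weights, noises))
            for t, u_t in enumerate(u)]


horizon = 20
u0 = [0.0] * horizon
u1 = path_integral_update(u0)
zero_noise = [0.0] * horizon
cost_before = rollout_cost(u0, zero_noise)
cost_after = rollout_cost(u1, zero_noise)
```

Low-cost samples dominate the weighted average, so the updated control moves the noise-free trajectory toward the goal. In a cluttered environment the samples can concentrate around a single local minimum, which is the failure mode the abstract describes.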