Viacheslav Zakharov

LG Sep 24, 2022

Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations

Letian Chen, Sravan Jayanthi, Rohan Paleja et al.

Learning from Demonstration (LfD) approaches empower end-users to teach robots novel tasks via demonstrations of the desired behaviors, democratizing access to robotics. However, current LfD frameworks are capable of neither fast adaptation to heterogeneous human demonstrations nor large-scale deployment in ubiquitous robotics applications. In this paper, we propose a novel LfD framework, Fast Lifelong Adaptive Inverse Reinforcement Learning (FLAIR). Our approach (1) leverages learned strategies to construct policy mixtures for fast adaptation to new demonstrations, allowing for quick end-user personalization; (2) distills common knowledge across demonstrations, achieving accurate task inference; and (3) expands its model only when needed in lifelong deployments, maintaining a concise set of prototypical strategies that can approximate all behaviors via policy mixtures. We empirically validate that FLAIR achieves adaptability (i.e., the robot adapts to heterogeneous, user-specific task preferences), efficiency (i.e., the robot achieves sample-efficient adaptation), and scalability (i.e., the model grows sublinearly with the number of demonstrations while maintaining high performance). FLAIR surpasses benchmarks across three control tasks with an average 57% improvement in policy returns and an average 78% fewer episodes required for demonstration modeling using policy mixtures. Finally, we demonstrate the success of FLAIR in a table tennis task and find users rate FLAIR as having higher task (p<.05) and personalization (p<.05) performance.

RO Apr 26, 2019

Perceptual Attention-based Predictive Control

Keuntaek Lee, Gabriel Nakajima An, Viacheslav Zakharov et al.

In this paper, we present a novel information processing architecture for safe deep learning-based visual navigation of autonomous systems. The proposed information processing architecture is used to support a perceptual attention-based predictive control algorithm that leverages model predictive control (MPC), convolutional neural networks (CNNs), and uncertainty quantification methods. The novelty of our approach lies in using MPC to learn how to place attention on relevant areas of the visual input, which ultimately allows the system to more rapidly detect unsafe conditions. We accomplish this by using MPC to learn to select regions of interest in the input image, which are used to output control actions as well as estimates of epistemic and aleatoric uncertainty in the attention-aware visual input. We use these uncertainty estimates to quantify the safety of our network controller under the current navigation condition. The proposed architecture and algorithm are tested on a 1:5 scale terrestrial vehicle. Experimental results show that the proposed algorithm outperforms previous approaches on early detection of unsafe conditions, such as when novel obstacles are present in the navigation environment. The proposed architecture is the first step towards using deep learning-based perceptual control policies in safety-critical domains.

2 Papers