Kenneth Bogert

IT
4papers
2citations
Novelty50%
AI Score22

4 Papers

LGAug 15, 2022
IRL with Partial Observations using the Principle of Uncertain Maximum Entropy

Kenneth Bogert, Yikang Gui, Prashant Doshi

The principle of maximum entropy is a broadly applicable technique for computing a distribution with the least amount of information possible while constrained to match empirically estimated feature expectations. However, in many real-world applications that use noisy sensors computing the feature expectations may be challenging due to partial observation of the relevant model variables. For example, a robot performing apprenticeship learning may lose sight of the agent it is learning from due to environmental occlusion. We show that in generalizing the principle of maximum entropy to these types of scenarios we unavoidably introduce a dependency on the learned model to the empirical feature expectations. We introduce the principle of uncertain maximum entropy and present an expectation-maximization based solution generalized from the principle of latent maximum entropy. Finally, we experimentally demonstrate the improved robustness to noisy data offered by our technique in a maximum causal entropy inverse reinforcement learning domain.

ITMay 17, 2023
The Principle of Uncertain Maximum Entropy

Kenneth Bogert, Matthew Kothe

The Principle of Maximum Entropy is a rigorous technique for estimating an unknown distribution given partial information while simultaneously minimizing bias. However, an important requirement for applying the principle is that the available information be provided error-free (Jaynes, 1982). We relax this requirement using a memoryless communication channel as a framework to derive a new, more general principle. We show our new principle provides an upper bound on the entropy of the unknown distribution and the amount of information lost due to the use of a given communications channel is unknown unless the unknown distribution's entropy is also known. Using our new principle we provide a new interpretation of the classic principle and experimentally show its performance relative to the classic principle and some other generally applicable solutions.

ITSep 9, 2021
Notes on Generalizing the Maximum Entropy Principle to Uncertain Data

Kenneth Bogert

The principle of maximum entropy is a broadly applicable technique for computing a distribution with the least amount of information possible constrained to match empirical data, for instance, feature expectations. We seek to generalize this principle to scenarios where the empirical feature expectations cannot be computed because the model variables are only partially observed, which introduces a dependency on the learned model. Generalizing the principle of latent maximum entropy, we introduce uncertain maximum entropy and describe an expectation-maximization based solution to approximately solve these problems. We show that our technique additionally generalizes the principle of maximum entropy. We additionally discuss the use of black box classifiers with our technique, which simplifies the process of utilizing sparse, large data sets.

LGJul 13, 2021
A Hierarchical Bayesian model for Inverse RL in Partially-Controlled Environments

Kenneth Bogert, Prashant Doshi

Robots learning from observations in the real world using inverse reinforcement learning (IRL) may encounter objects or agents in the environment, other than the expert, that cause nuisance observations during the demonstration. These confounding elements are typically removed in fully-controlled environments such as virtual simulations or lab settings. When complete removal is impossible the nuisance observations must be filtered out. However, identifying the source of observations when large amounts of observations are made is difficult. To address this, we present a hierarchical Bayesian model that incorporates both the expert's and the confounding elements' observations thereby explicitly modeling the diverse observations a robot may receive. We extend an existing IRL algorithm originally designed to work under partial occlusion of the expert to consider the diverse observations. In a simulated robotic sorting domain containing both occlusion and confounding elements, we demonstrate the model's effectiveness. In particular, our technique outperforms several other comparative methods, second only to having perfect knowledge of the subject's trajectory.