LG AI

Nov 26, 2023

Optimally Teaching a Linear Behavior Cloning Agent

Shubham Kumar Bharti, Stephen Wright, Adish Singla, Xiaojin Zhu

arXiv:2311.15399v1 · 2.0 · h-index: 34

Originality: Incremental advance

AI Analysis

This paper addresses efficient teaching in imitation learning for AI and robotics. The advance is incremental, as it builds on existing teaching dimension concepts.

The paper studies how to optimally teach a linear behavior cloning agent by selecting a minimal set of states to demonstrate for a target policy. It presents an algorithm, TIE, that achieves the instance-optimal teaching dimension, proves that finding an optimal teaching set is NP-hard, and reports experiments that validate the algorithm's efficiency.

We study optimal teaching of Linear Behavior Cloning (LBC) learners. In this setup, the teacher can select which states to demonstrate to an LBC learner. The learner maintains a version space of infinitely many linear hypotheses consistent with the demonstrations. The goal of the teacher is to teach a realizable target policy to the learner using the minimum number of state demonstrations. This number is known as the Teaching Dimension (TD). We present a teaching algorithm called "Teach using Iterative Elimination (TIE)" that achieves instance-optimal TD. However, we also show that computing an optimal teaching set is NP-hard. We further provide an approximation algorithm that guarantees an approximation ratio of $\log(|A|-1)$ on the teaching dimension. Finally, we provide experimental results to validate the efficiency and effectiveness of our algorithm.
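To make the version-space idea concrete, here is a minimal sketch of a consistency check for a linear hypothesis. It is not the paper's code. It assumes the learner's policy picks the action maximizing `w · phi(s, a)` and that consistency requires a strict win for the demonstrated action. The feature map `phi`, the four-state toy problem, and the action names are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def is_consistent(
    w: Vector,
    demos: Dict[int, str],
    actions: Sequence[str],
    phi: Callable[[int, str], Vector],
) -> bool:
    """Return True if w lies in the version space induced by demos.

    Assumption: w is consistent when, in every demonstrated state, the
    demonstrated action scores strictly higher than every other action.
    """
    for state, target in demos.items():
        target_score = dot(w, phi(state, target))
        for action in actions:
            if action != target and dot(w, phi(state, action)) >= target_score:
                return False
    return True


# Hypothetical toy problem: states 0..3, two actions.
ACTIONS = ["left", "right"]


def phi(state: int, action: str) -> Vector:
    """Hypothetical features: 'right' gets [state, 1], 'left' gets zeros."""
    return [float(state), 1.0] if action == "right" else [0.0, 0.0]


# Target policy: go right iff state >= 2 (realized by w = [1.0, -1.5]).
boundary_demos = {1: "left", 2: "right"}
extreme_demos = {0: "left", 3: "right"}

print(is_consistent([1.0, -1.5], boundary_demos, ACTIONS, phi))
print(is_consistent([1.0, -0.5], boundary_demos, ACTIONS, phi))
print(is_consistent([1.0, -0.5], extreme_demos, ACTIONS, phi))
```

In this toy problem, demonstrating states 1 and 2 forces every consistent `w` to satisfy `w[0] + w[1] < 0 < 2*w[0] + w[1]`, which implies the target action in states 0 and 3 as well. Demonstrating states 0 and 3 instead leaves `w = [1.0, -0.5]` in the version space, although that hypothesis goes right in state 1. This is the sense in which the teacher's choice of states matters.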
