CLAI | Dec 7, 2023

Self-Supervised Behavior Cloned Transformers are Path Crawlers for Text Games

arXiv:2312.04657v1132 citations | h-index: 21 | EMNLP
Originality: Incremental advance
AI Analysis

This work reduces the dependence on supervised training data for researchers building text-game agents, though the advance is incremental because it builds on existing behavior cloning methods.

The paper tackles the problem of generating training data for behavior cloning transformers in text games, reaching about 90% of the performance of supervised systems across three benchmarks.

In this work, we introduce a self-supervised behavior cloning transformer for text games, which are challenging benchmarks for multi-step reasoning in virtual environments. Traditionally, Behavior Cloning Transformers excel in such tasks but rely on supervised training data. Our approach auto-generates training data by exploring trajectories (defined by common macro-action sequences) that lead to reward within the games, while determining the generality and utility of these trajectories by rapidly training small models then evaluating their performance on unseen development games. Through empirical analysis, we show our method consistently uncovers generalizable training data, achieving about 90% performance of supervised systems across three benchmark text games.
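
The selection loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `ToyGame`, `MACRO_ACTIONS`, and every function name below are hypothetical, the games are reduced to a single winning macro-action sequence, and the "small model" is replaced by a lookup policy that memorizes its training paths rather than a transformer. The sketch only shows the shape of the idea: crawl for rewarded paths, group them by macro-action sequence, and keep the group whose quickly fitted model scores best on unseen development games.

```python
from collections import Counter
from itertools import product

# Hypothetical macro-action vocabulary (an assumption for illustration).
MACRO_ACTIONS = ["take", "open", "put", "move"]


class ToyGame:
    """Hypothetical stand-in for a text game: one winning macro-action sequence."""

    def __init__(self, solution):
        self.solution = tuple(solution)

    def reward(self, macro_actions):
        # 1.0 if the sequence solves the game, otherwise 0.0.
        return 1.0 if tuple(macro_actions) == self.solution else 0.0


def crawl_rewarded_paths(game, max_len):
    """Enumerate macro-action sequences up to max_len and keep those that earn reward."""
    found = []
    for n in range(1, max_len + 1):
        for seq in product(MACRO_ACTIONS, repeat=n):
            if game.reward(seq) > 0:
                found.append(seq)
    return found


def fit_small_model(train_paths):
    """Stand-in for rapidly training a small model: memorize the most common path."""
    most_common_path, _ = Counter(train_paths).most_common(1)[0]
    return lambda game: most_common_path


def dev_score(policy, dev_games):
    """Average reward of a policy on held-out development games."""
    return sum(g.reward(policy(g)) for g in dev_games) / len(dev_games)


def select_training_data(train_games, dev_games, max_len=3):
    """Group crawled paths by macro-action sequence and score each group on dev games."""
    by_pattern = {}
    for game in train_games:
        for seq in crawl_rewarded_paths(game, max_len):
            by_pattern.setdefault(seq, []).append(seq)
    scored = {
        pattern: dev_score(fit_small_model(paths), dev_games)
        for pattern, paths in by_pattern.items()
    }
    best = max(scored, key=scored.get)
    return best, scored


# Usage: two training games share a pattern that also generalizes to most dev games.
train_games = [ToyGame(["take", "put"]), ToyGame(["take", "put"]), ToyGame(["open", "move"])]
dev_games = [ToyGame(["take", "put"]), ToyGame(["take", "put"]), ToyGame(["open", "move"])]
best_pattern, scores = select_training_data(train_games, dev_games)
print(best_pattern, scores)
```

In this toy setting the pattern that solves more development games is selected as training data. In the setting the abstract describes, the scoring step would instead train a small behavior cloning model on the trajectories in each group.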

