LG · NE · Feb 10, 2025

Utilizing Novelty-based Evolution Strategies to Train Transformers in Reinforcement Learning

arXiv:2502.06301v2 · 1 citation · h-index: 1 · ICTAI
Originality: Synthesis-oriented
AI Analysis

This work addresses how to efficiently train complex transformer models in reinforcement learning, a problem of interest to AI researchers. It is incremental, however, because it builds on existing novelty-based methods.

The paper applies novelty-based evolution strategies (NS-ES and NSR-ES) to training transformer-based architectures such as Decision Transformers in reinforcement learning, with mixed results. NS-ES showed progress but would need many more iterations to yield interesting agents. For NSR-ES, performance on the feed-forward model and on the Decision Transformer was about as similar as it had been for OpenAI-ES in the authors' previous work.

In this paper, we experiment with novelty-based variants of OpenAI-ES, the NS-ES and NSR-ES algorithms, and evaluate their effectiveness in training complex, transformer-based architectures designed for reinforcement learning, such as Decision Transformers. We also test whether we can accelerate the novelty-based training of these larger models by seeding the training with a pretrained model. The experimental results were mixed. NS-ES showed progress, but it would clearly need many more iterations to yield interesting agents. NSR-ES, on the other hand, proved quite capable of being used straightforwardly on larger models: its performance on the feed-forward model and on the Decision Transformer appears about as similar as it was for OpenAI-ES in our previous work.
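To make the difference between the three algorithms concrete, the sketch below shows one NSR-ES-style update on a toy problem. It is not the authors' implementation. It assumes the usual formulation of these methods: novelty is the mean distance from an agent's behaviour characterization to its k nearest neighbours in an archive, and NSR-ES averages the reward and novelty signals before the OpenAI-ES-style update. The rank normalisation, the toy `toy_evaluate` function, and all hyperparameter values are illustrative assumptions.

```python
import math
import random


def novelty(bc, archive, k=3):
    """Mean Euclidean distance from bc to its k nearest archive entries.

    The archive must be non-empty.
    """
    dists = sorted(math.dist(bc, other) for other in archive)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


def centered_ranks(values):
    """Map values to evenly spaced ranks in [-0.5, 0.5] (an assumed normalisation)."""
    n = len(values)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for rank, i in enumerate(order):
        ranks[i] = rank / (n - 1) - 0.5
    return ranks


def nsr_es_step(theta, evaluate, archive, rng,
                pop_size=20, sigma=0.1, lr=0.05, k=3):
    """One NSR-ES-style update; evaluate(params) returns (reward, behaviour)."""
    noises, rewards, novelties = [], [], []
    for _ in range(pop_size):
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        candidate = [t + sigma * e for t, e in zip(theta, eps)]
        reward, bc = evaluate(candidate)
        noises.append(eps)
        rewards.append(reward)
        novelties.append(novelty(bc, archive, k))

    # NSR-ES: average reward and novelty. Using only n_rank here would give
    # an NS-ES-style step, and using only r_rank an OpenAI-ES-style step.
    r_rank = centered_ranks(rewards)
    n_rank = centered_ranks(novelties)
    weights = [(r + n) / 2.0 for r, n in zip(r_rank, n_rank)]

    # Gradient estimate: weighted sum of the sampled noise vectors.
    new_theta = []
    for j, t in enumerate(theta):
        grad_j = sum(w * eps[j] for w, eps in zip(weights, noises))
        new_theta.append(t + lr * grad_j / (pop_size * sigma))

    # Record the behaviour of the updated parameters in the archive.
    _, bc = evaluate(new_theta)
    archive.append(bc)
    return new_theta


def toy_evaluate(params):
    """Hypothetical stand-in for an RL rollout: reward peaks at all-ones."""
    reward = -sum((p - 1.0) ** 2 for p in params)
    behaviour = tuple(params[:2])  # first two parameters act as the behaviour
    return reward, behaviour


if __name__ == "__main__":
    rng = random.Random(0)
    theta = [0.0, 0.0, 0.0]
    archive = [toy_evaluate(theta)[1]]
    for _ in range(50):
        theta = nsr_es_step(theta, toy_evaluate, archive, rng)
    print(theta, len(archive))
```

In a real setting `theta` would be the flattened weights of a feed-forward policy or a Decision Transformer, and `evaluate` would run an environment episode. Seeding with a pretrained model, as the abstract describes, would correspond to starting from pretrained weights instead of zeros.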

