Utilizing Novelty-based Evolution Strategies to Train Transformers in Reinforcement Learning
This work addresses the problem of efficiently training complex transformer models for reinforcement learning. The contribution is incremental, since it builds on existing novelty-based methods.
The paper applies novelty-based evolution strategies (NS-ES and NSR-ES) to the training of transformer-based reinforcement learning architectures such as Decision Transformers, with mixed results: NS-ES showed progress but would need many more iterations, while NSR-ES performed about as similarly on the feed-forward model and the Decision Transformer as OpenAI-ES did in the authors' previous work.
In this paper, we experiment with novelty-based variants of OpenAI-ES, the NS-ES and NSR-ES algorithms, and evaluate their effectiveness in training complex, transformer-based architectures designed for reinforcement learning, such as Decision Transformers. We also test whether the novelty-based training of these larger models can be accelerated by seeding it with a pretrained model. The experimental results were mixed. NS-ES showed progress, but it would clearly need many more iterations to yield interesting agents. NSR-ES, on the other hand, proved capable of being applied to larger models in a straightforward way: its performance on the feed-forward model and on the Decision Transformer appears to be as similar as it was for OpenAI-ES in our previous work.
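For orientation, the following is a minimal sketch of how a single NSR-ES-style update could look, assuming the commonly described formulation in which novelty is the mean distance of a behavior characterization to its k nearest neighbors in an archive, and in which reward and novelty are rank-normalized and averaged. It is not the implementation used in this paper. The names `nsr_es_step`, `evaluate`, and `reward_weight`, as well as all hyperparameter values, are hypothetical, and the meta-population and archive updates are omitted.

```python
import numpy as np

def novelty(bc, archive, k=10):
    """Mean Euclidean distance from `bc` to its k nearest neighbours in `archive`."""
    dists = np.linalg.norm(np.asarray(archive) - np.asarray(bc), axis=1)
    k = min(k, len(dists))
    return float(np.sort(dists)[:k].mean())

def centered_ranks(x):
    """Replace values by their ranks, scaled to the interval [-0.5, 0.5]."""
    x = np.asarray(x, dtype=float)
    ranks = np.empty(len(x), dtype=float)
    ranks[np.argsort(x)] = np.arange(len(x))
    if len(x) > 1:
        ranks /= len(x) - 1
    return ranks - 0.5

def nsr_es_step(theta, evaluate, archive, rng, n=20, sigma=0.1, lr=0.05,
                k=10, reward_weight=0.5):
    """One update of the flat parameter vector `theta` (illustrative only).

    `evaluate(params)` is a hypothetical callable returning a pair
    (episode_return, behavior_characterization).
    """
    eps = rng.standard_normal((n, theta.size))  # Gaussian perturbations
    rewards = np.empty(n)
    novelties = np.empty(n)
    for i in range(n):
        rewards[i], bc = evaluate(theta + sigma * eps[i])
        novelties[i] = novelty(bc, archive, k)
    # Assumption: reward and novelty are combined after rank normalization.
    score = (reward_weight * centered_ranks(rewards)
             + (1.0 - reward_weight) * centered_ranks(novelties))
    grad = score @ eps / (n * sigma)  # ES gradient estimate
    return theta + lr * grad
```

In this sketch, setting `reward_weight=1.0` reduces the update to a reward-only step in the style of OpenAI-ES, while `reward_weight=0.0` gives a novelty-only step in the style of NS-ES. Because the update touches the model only through a flat parameter vector, the same loop applies to a feed-forward policy or to a Decision Transformer.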