P2DT: Mitigating Forgetting in task-incremental Learning with progressive prompt Decision Transformer
This work addresses catastrophic forgetting in continual and offline reinforcement learning for intelligent agents, but its contribution appears incremental because it builds on transformer-based models with prompt-based enhancements.
The paper tackles catastrophic forgetting in task-incremental learning for intelligent agents controlled by large models. It proposes the Progressive Prompt Decision Transformer (P2DT), which dynamically appends decision tokens to foster task-specific policies. Preliminary results show effective mitigation of forgetting and good scalability as the number of tasks increases.
Catastrophic forgetting poses a substantial challenge for managing intelligent agents controlled by a large model, causing performance degradation when these agents face new tasks. In our work, we propose a novel solution, the Progressive Prompt Decision Transformer (P2DT). This method enhances a transformer-based model by dynamically appending decision tokens during new task training, thus fostering task-specific policies. Our approach mitigates forgetting in continual and offline reinforcement learning scenarios. Moreover, P2DT leverages trajectories collected via traditional reinforcement learning from all tasks and generates new task-specific tokens during training, thereby retaining knowledge from previous tasks. Preliminary results demonstrate that our model effectively alleviates catastrophic forgetting and scales well as the number of task environments increases.
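The idea of appending task-specific tokens as new tasks arrive can be illustrated with a small sketch. This is not the authors' implementation: the class `ProgressivePromptPool`, its methods, the choice to freeze earlier prompts, and the choice to prepend the prompts of all tasks seen so far are assumptions made for illustration, since the abstract does not specify these details. Token vectors are plain Python lists standing in for learned embeddings.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ProgressivePromptPool:
    """Hypothetical container for per-task prompt (decision) tokens."""
    embed_dim: int
    tokens_per_task: int = 2
    seed: int = 0
    prompts: dict = field(default_factory=dict)  # task_id -> list of token vectors
    frozen: set = field(default_factory=set)     # task_ids whose prompts no longer train

    def add_task(self, task_id):
        """Freeze every existing prompt, then append fresh tokens for the new task."""
        if task_id in self.prompts:
            raise ValueError(f"task {task_id!r} already registered")
        # Assumption: earlier prompts are frozen so later training cannot overwrite them.
        self.frozen.update(self.prompts)
        rng = random.Random(f"{self.seed}-{task_id}")
        self.prompts[task_id] = [
            [rng.gauss(0.0, 0.02) for _ in range(self.embed_dim)]
            for _ in range(self.tokens_per_task)
        ]

    def trainable_tasks(self):
        """Return the task ids whose prompt tokens would still receive updates."""
        return [t for t in self.prompts if t not in self.frozen]

    def build_input(self, task_id, trajectory_tokens):
        """Prepend the prompts of every task up to and including task_id."""
        if task_id not in self.prompts:
            raise KeyError(f"unknown task {task_id!r}")
        prefix = []
        for tid, tokens in self.prompts.items():  # dicts preserve insertion order
            prefix.extend(tokens)
            if tid == task_id:
                break
        return prefix + list(trajectory_tokens)


pool = ProgressivePromptPool(embed_dim=4)
pool.add_task("task_a")
pool.add_task("task_b")

# Placeholder embeddings for three trajectory tokens (e.g. return, state, action).
trajectory = [[0.0] * 4 for _ in range(3)]
seq_a = pool.build_input("task_a", trajectory)
seq_b = pool.build_input("task_b", trajectory)
print(len(seq_a), len(seq_b), pool.trainable_tasks())  # 5 7 ['task_b']
```

In this sketch the input for `task_b` starts with the same frozen tokens used for `task_a`, so adding a task lengthens the prompt prefix without modifying anything learned earlier. In a real model the resulting sequence would be fed to the transformer, and only the newest prompt tokens would be updated.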