AI · Feb 19, 2025

Model Evolution Framework with Genetic Algorithm for Multi-Task Reinforcement Learning

Yan Yu, Wengang Zhou, Yaodong Yang, Wanxuan Lu, Yingyan Hou, Houqiang Li

arXiv:2502.13569v13.3 · h-index: 67

Originality: Incremental advance

AI Analysis

This work addresses resource allocation for multi-task agents in robotics, though it appears incremental as it builds on existing routing network methods.

The paper tackles the challenge of allocating resources according to task difficulty in multi-task reinforcement learning. It proposes a Model Evolution framework with Genetic Algorithm (MEGA), which dynamically adjusts model modules during training, and reports state-of-the-art performance on robotic manipulation tasks in the Meta-World benchmark.

Multi-task reinforcement learning employs a single policy to complete various tasks, aiming to develop an agent that generalizes across different scenarios. Because tasks share characteristics, the agent's learning efficiency can be enhanced through parameter sharing. Existing approaches typically use a routing network to generate a specific route for each task, recombining a set of modules into diverse models that complete multiple tasks simultaneously. However, because tasks differ inherently, it is crucial to allocate resources according to task difficulty, and this allocation is constrained by the model's structure. To this end, we propose a Model Evolution framework with Genetic Algorithm (MEGA), which enables the model to evolve during training according to the difficulty of the tasks. When the current model is insufficient for certain tasks, the framework automatically incorporates additional modules, enhancing the model's capabilities. Moreover, to fit this model evolution framework, we introduce a genotype module-level model, which uses binary sequences as genotype policies for model reconstruction and optimizes these genotype policies with a gradient-free genetic algorithm. Unlike routing networks with fixed output dimensions, our approach allows the length of the genotype policy to be adjusted dynamically, so it can accommodate models with a varying number of modules. We conducted experiments on various robotic manipulation tasks in the Meta-World benchmark. The state-of-the-art performance demonstrates the effectiveness of the MEGA framework. We will release our source code to the public.
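The genotype idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It is a generic elitist genetic algorithm over binary module masks, written under stated assumptions: each genotype holds one bit per module (1 means the module is used for a task), fitness is supplied by the caller as a hypothetical stand-in for task return, and the names `mutate`, `crossover`, `grow`, and `evolve` are illustrative. The `grow` helper shows one plausible way to lengthen genotypes when modules are added, by padding with zeros.

```python
import random
from typing import Callable, List

Genotype = List[int]  # assumption: one bit per module, 1 = module is active


def mutate(genotype: Genotype, rate: float, rng: random.Random) -> Genotype:
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genotype]


def crossover(a: Genotype, b: Genotype, rng: random.Random) -> Genotype:
    """Single-point crossover of two equal-length genotypes."""
    assert len(a) == len(b)
    point = rng.randint(1, len(a) - 1) if len(a) > 1 else 0
    return a[:point] + b[point:]


def grow(population: List[Genotype], extra_modules: int = 1) -> List[Genotype]:
    """Lengthen every genotype after modules are added to the model.

    New bits start at 0, so existing module selections are unchanged
    until mutation switches the new modules on.
    """
    return [g + [0] * extra_modules for g in population]


def evolve(
    population: List[Genotype],
    fitness: Callable[[Genotype], float],  # hypothetical stand-in for task return
    rng: random.Random,
    generations: int = 20,
    mutation_rate: float = 0.1,
) -> List[Genotype]:
    """Gradient-free search: keep the fitter half, refill with mutated offspring."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[: max(2, len(ranked) // 2)]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng),
                   mutation_rate, rng)
            for _ in range(len(population) - len(elite))
        ]
        population = elite + children
    return population


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy fitness: agreement with a fixed target mask (purely illustrative).
    target = [1, 0, 1, 1]
    score = lambda g: sum(int(x == t) for x, t in zip(g, target))
    pop = [[rng.randint(0, 1) for _ in range(4)] for _ in range(6)]
    pop = evolve(pop, score, rng)
    pop = grow(pop, extra_modules=2)  # the model gained two modules
    print(max(pop, key=lambda g: score(g[:4])))
```

Because the elite is carried over unchanged, the best fitness in the population never decreases between generations. Zero-padding in `grow` is a design choice made for this sketch: it keeps every existing genotype valid for the larger model, which is the property that a fixed-width routing output lacks.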
