Efficient Reinforcement Learning in Resource Allocation Problems Through Permutation Invariant Multi-task Learning
This work addresses sample efficiency in reinforcement learning for resource allocation problems. The contribution appears incremental: it builds on existing multi-task learning concepts and adds a specific network architecture and task-sampling strategy.
The paper tackles the challenge of learning from limited training samples in reinforcement learning. It exploits an invariance property of the tasks to increase the available data through multi-task learning, and demonstrates the approach on financial portfolio optimization and meta federated learning.
One of the main challenges in real-world reinforcement learning is to learn successfully from limited training samples. We show that in certain settings, the available data can be dramatically increased through a form of multi-task learning, by exploiting an invariance property in the tasks. We provide a theoretical performance bound for the gain in sample efficiency under this setting. This motivates a new approach to multi-task learning, which involves the design of an appropriate neural network architecture and a prioritized task-sampling strategy. We demonstrate empirically the effectiveness of the proposed approach on two real-world sequential resource allocation tasks where this invariance property occurs: financial portfolio optimization and meta federated learning.
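As a rough illustration of the invariance property named in the title, the sketch below builds a toy allocation policy whose output weights follow any reordering of the resources (permutation equivariance). It is not the paper's architecture: the function `allocate`, the weights `w_self` and `w_pool`, and the mean-pooling design are hypothetical choices made only for this example.

```python
import numpy as np

def allocate(features, w_self, w_pool):
    """Toy allocation policy (illustrative assumption, not the paper's network).

    features: array of shape (n_resources, d), one row per resource.
    Returns non-negative allocation weights that sum to 1.
    """
    # Mean pooling gives a summary that does not depend on resource order.
    pooled = features.mean(axis=0, keepdims=True)
    # The same weights score every resource, so no position is special.
    scores = (features @ w_self + pooled @ w_pool).ravel()
    # Numerically stable softmax turns scores into allocation weights.
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 3))   # 5 resources, 3 features each
w_self = rng.normal(size=(3, 1))
w_pool = rng.normal(size=(3, 1))

perm = rng.permutation(5)            # an arbitrary relabeling of the resources
base = allocate(features, w_self, w_pool)
permuted = allocate(features[perm], w_self, w_pool)

# Reordering the inputs reorders the allocation in exactly the same way.
assert np.allclose(permuted, base[perm])
```

Under this reading, each relabeling of the resources behaves like an additional task built from the same data, which is one way to picture how the invariance could increase the number of usable samples.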