AI · LG · MA · Apr 28, 2020

Improving Sample Efficiency and Multi-Agent Communication in RL-based Train Rescheduling

arXiv:2004.13439v1 · 5 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses train rescheduling for logistics optimization, but it is incremental: it builds on existing competition entries and presents only preliminary results.

The paper tackles train rescheduling with reinforcement learning, based on an entry that placed sixth in the Flatland competition. It hypothesizes that policy gradient methods may be unsuitable for high-consequence environments and suggests learned communication actions as a potential remedy.

We present preliminary results from our sixth-placed entry to the Flatland international competition for train rescheduling, including two improvements to reinforcement learning (RL) training efficiency, and two hypotheses about the prospects of deep RL for complex real-world control tasks: first, that current state-of-the-art policy gradient methods seem inappropriate for high-consequence environments; second, that learning explicit communication actions (an emerging machine-to-machine language, so to speak) might offer a remedy. These hypotheses need to be confirmed by future work. If confirmed, they hold promise for optimizing highly efficient logistics ecosystems like the Swiss Federal Railways network.

