LG, May 27, 2021

Pattern Transfer Learning for Reinforcement Learning in Order Dispatching

arXiv:2105.13218v2, 4 citations
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem for ride-sharing platforms, offering an incremental improvement by enhancing data re-utilization in non-stationary systems.

The paper tackles non-stationarity in ride-sharing order dispatch by proposing a pattern transfer learning framework for value-based reinforcement learning. The framework adds a concordance penalty that captures value patterns that stay stable across environments, and the authors report superior performance in experiments.

Order dispatch is one of the central problems for ride-sharing platforms. Recently, value-based reinforcement learning algorithms have shown promising performance on this problem. However, in real-world applications, the non-stationarity of the demand-supply system makes it difficult to reuse data generated in different time periods to learn the value function. In this work, motivated by the fact that the relative relationship between the values of some states is largely stable across various environments, we propose a pattern transfer learning framework for value-based reinforcement learning in the order dispatch problem. Our method efficiently captures the value patterns by incorporating a concordance penalty. The superior performance of the proposed method is supported by experiments.
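
The abstract does not give the form of the concordance penalty, so the following is only a minimal sketch of one way such a penalty could look, not the paper's formulation. It assumes a pairwise hinge penalty: for every pair of states whose values are strictly ordered in an old (source) environment, it penalizes the new (target) value estimates when they reverse that ordering. The function name `concordance_penalty`, the `margin` parameter, and the hinge form are all hypothetical choices made for illustration.

```python
from itertools import combinations


def concordance_penalty(source_values, target_values, margin=0.0):
    """Average pairwise hinge penalty for ordering disagreements.

    ASSUMPTION: this is an illustrative pairwise-ranking penalty, not the
    formulation used in the paper. source_values[i] and target_values[i]
    are value estimates for the same state i in two environments.
    """
    if len(source_values) != len(target_values):
        raise ValueError("value lists must cover the same states")

    total = 0.0
    n_pairs = 0
    for i, j in combinations(range(len(source_values)), 2):
        diff_src = source_values[i] - source_values[j]
        if diff_src == 0:
            continue  # a tie in the source carries no ordering information
        sign = 1.0 if diff_src > 0 else -1.0
        diff_tgt = target_values[i] - target_values[j]
        # Zero when the target ordering agrees with the source by at least
        # `margin`; grows linearly as the target ordering is reversed.
        total += max(0.0, margin - sign * diff_tgt)
        n_pairs += 1

    return total / n_pairs if n_pairs else 0.0


if __name__ == "__main__":
    old = [1.0, 2.0, 3.0]
    print(concordance_penalty(old, [10.0, 20.0, 30.0]))  # same ordering
    print(concordance_penalty(old, [30.0, 20.0, 10.0]))  # reversed ordering
```

Under this assumption, the penalty would be added to the usual value-learning loss with a tunable weight, so that the absolute values can shift between time periods while the relative ordering of states is encouraged to carry over.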
