Preference-Agile Multi-Objective Optimization for Real-time Vehicle Dispatching
For practitioners in logistics and transportation, this work enables real-time multi-objective optimization with dynamic user preferences, addressing a practical gap in existing MOO methods.
The paper proposes a preference-agile multi-objective optimization (PAMOO) method using deep reinforcement learning to allow dynamic adjustment of preferences in real-time vehicle dispatching. Experiments on container terminal dispatching show superior performance and generalization over two popular MOO methods.
Multi-objective optimization (MOO) has been widely studied in the literature because of its versatility in human-centered decision making in real-life applications. Recently, demand for dynamic MOO has been growing quickly, driven by tough market dynamics that require real-time readjustment of the priorities assigned to different objectives. However, most existing studies focus either on deterministic MOO problems, which are not practical, or on non-sequential dynamic MOO decision problems, which cannot deal with some real-life complexities. To address these challenges, this paper proposes a preference-agile multi-objective optimization (PAMOO) method that permits users to adjust and interactively assign preferences on the fly. To achieve this, a novel uniform model within a deep reinforcement learning (DRL) framework is proposed that takes users' dynamic preference vectors as explicit inputs. Additionally, a calibration function is fitted to ensure high-quality alignment between the preference vector inputs and the output DRL decision policy. Extensive experiments on challenging real-life vehicle dispatching problems at a container terminal show that PAMOO obtains superior performance and generalization ability compared with two of the most popular MOO methods. Our method presents the first dynamic MOO method for challenging dynamic sequential MOO decision problems.
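To make the two ingredients named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the preference vector is rescaled to sum to one and concatenated with the state features before being passed to a policy, that vector-valued rewards are combined by a weighted sum, and that calibration can be approximated by inverting a monotone, piecewise-linear map from requested to achieved trade-offs. All function names (normalize_preference, policy_input, scalarized_reward, calibrate) and all numbers are hypothetical and chosen for illustration only.

```python
import numpy as np


def normalize_preference(w):
    """Rescale a non-negative preference vector so its entries sum to 1."""
    w = np.asarray(w, dtype=float)
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("preferences must be non-negative and not all zero")
    return w / w.sum()


def policy_input(state, w):
    """Assumed conditioning scheme: append the preference vector to the state."""
    return np.concatenate([np.asarray(state, dtype=float), normalize_preference(w)])


def scalarized_reward(reward_vec, w):
    """Assumed weighted-sum scalarization of a per-objective reward vector."""
    return float(np.dot(normalize_preference(w), np.asarray(reward_vec, dtype=float)))


def calibrate(target, requested, achieved):
    """Hypothetical calibration: find the weight to request so that the
    achieved trade-off matches `target`, by linearly interpolating the
    inverse of the measured requested -> achieved curve.
    `achieved` must be increasing for np.interp to be valid."""
    return float(np.interp(target, achieved, requested))


# Illustrative values only (not taken from the paper).
x = policy_input(state=[0.2, 0.5, 0.1], w=[3, 1])      # 3 state features + 2 weights
r = scalarized_reward(reward_vec=[10.0, -4.0], w=[3, 1])
w_cal = calibrate(0.5, requested=[0.0, 0.5, 1.0], achieved=[0.0, 0.3, 1.0])
```

Because the preference vector is an ordinary input in this sketch, changing it between decisions requires no retraining, which is the kind of on-the-fly adjustment the abstract describes. The interpolation step only stands in for the fitted calibration function, whose actual form is not specified here.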