LG | Mar 21, 2025

Curriculum RL meets Monte Carlo Planning: Optimization of a Real World Container Management Problem

arXiv:2503.17194v1 | h-index: 2 | ECML/PKDD
Originality: Incremental advance
AI Analysis

The paper addresses safe and efficient container management in real-world waste-sorting facilities and represents an incremental improvement over existing RL approaches.

The paper tackles container management in waste-sorting facilities with limited processing capacity, where conventional RL methods struggle to balance throughput against safety risks such as collisions and overflow. The proposed hybrid method combines a curriculum-learning pipeline for training a PPO agent with an offline pairwise collision model applied at inference time. In the reported experiments, it improves collision avoidance and reduces safety-limit violations while maintaining high throughput across varying container-to-PU ratios.

In this work, we augment reinforcement learning with an inference-time collision model to ensure safe and efficient container management in a waste-sorting facility with limited processing capacity. Each container has two optimal emptying volumes that trade off higher throughput against overflow risk. Conventional reinforcement learning (RL) approaches struggle under delayed rewards, sparse critical events, and high-dimensional uncertainty -- failing to consistently balance higher-volume empties with the risk of safety-limit violations. To address these challenges, we propose a hybrid method comprising: (1) a curriculum-learning pipeline that incrementally trains a PPO agent to handle delayed rewards and class imbalance, and (2) an offline pairwise collision model used at inference time to proactively avert collisions with minimal online cost. Experimental results show that our targeted inference-time collision checks significantly improve collision avoidance, reduce safety-limit violations, maintain high throughput, and scale effectively across varying container-to-PU ratios. These findings offer actionable guidelines for designing safe and efficient container-management systems in real-world facilities.
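The inference-time collision check described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the collision model's form, so everything here is an assumption made for illustration: the lookup table `collision_prob`, the `IDLE` action, the `threshold` value, and the function name `safe_action` are hypothetical and are not taken from the paper or its repository. The idea shown is only the general pattern: a trained policy ranks candidate containers, and a precomputed pairwise table vetoes candidates that are too likely to collide with containers already scheduled.

```python
from typing import Dict, List, Tuple

IDLE = -1  # hypothetical "do nothing" action; not from the paper


def safe_action(
    ranked_actions: List[int],
    scheduled: List[int],
    collision_prob: Dict[Tuple[int, int], float],
    threshold: float = 0.1,
) -> int:
    """Pick the highest-ranked container whose worst pairwise collision
    risk against every already-scheduled container is <= threshold.

    ranked_actions: container ids ordered by policy preference (best first).
    collision_prob: offline table keyed by (smaller_id, larger_id);
                    missing pairs are treated as risk 0.0 (an assumption).
    """
    for action in ranked_actions:
        if action == IDLE:
            return IDLE
        # Worst-case risk of this candidate against the current schedule.
        risk = max(
            (
                collision_prob.get((min(action, s), max(action, s)), 0.0)
                for s in scheduled
            ),
            default=0.0,
        )
        if risk <= threshold:
            return action
    # Every candidate was vetoed: fall back to doing nothing.
    return IDLE


# Hypothetical table: containers 0 and 1 often need emptying together.
table = {(0, 1): 0.8, (0, 2): 0.05}
choice = safe_action([1, 2, IDLE], scheduled=[0], collision_prob=table)
```

In this usage example the policy prefers container 1, but its tabulated risk against the already-scheduled container 0 exceeds the threshold, so the check falls through to container 2. Because the table is built offline, the online cost is a dictionary lookup per scheduled container, which matches the abstract's emphasis on minimal online cost.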

Code Implementations: 1 repo
