RO | LG | Feb 18, 2025

Learning a High-quality Robotic Wiping Policy Using Systematic Reward Analysis and Visual-Language Model Based Curriculum

arXiv:2502.12599v1 | 2 citations | h-index: 2 | ICRA
Originality: Incremental advance
AI Analysis

This work addresses the challenge of reward engineering in deep reinforcement learning for robotic wiping tasks, with potential applications in industries such as manufacturing and healthcare, though its advance over existing methods is incremental.

The paper tackled the problem of autonomous robotic wiping, which requires both high quality and fast completion, by proposing a bounded reward formulation to improve convergence and a visual-language model-based curriculum for hyperparameter tuning. The combined method successfully learned a wiping policy for surfaces with varying curvatures, frictions, and waypoints, which baseline methods failed to achieve.

Autonomous robotic wiping is an important task in various industries, ranging from industrial manufacturing to sanitization in healthcare. Deep reinforcement learning (Deep RL) has emerged as a promising approach; however, it often suffers from a high demand for repetitive reward engineering. Instead of relying on manual tuning, we first analyze the convergence of quality-critical robotic wiping, which requires both high-quality wiping and fast task completion, to show the poor convergence of the problem and propose a new bounded reward formulation to make the problem feasible. Then, we further improve the learning process by proposing a novel visual-language model (VLM) based curriculum, which actively monitors the progress and suggests hyperparameter tuning. We demonstrate that the combined method can find a desirable wiping policy on surfaces with various curvatures, frictions, and waypoints, which cannot be learned with the baseline formulation. The demo of this project can be found at: https://sites.google.com/view/highqualitywiping.
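The abstract does not state the bounded reward formula, so the following is only a rough sketch of the general idea of bounding a per-step reward, not the paper's formulation. It assumes a hypothetical non-negative wiping-quality error and a constant per-step time cost; the function names, the exponential shaping, and all default values are illustrative assumptions.

```python
import math


def bounded_wiping_reward(quality_error, step_cost=0.1, scale=1.0):
    """Hypothetical per-step reward confined to [-step_cost, 1 - step_cost].

    quality_error: non-negative deviation from the desired wiping quality
                   (assumed signal, e.g. a contact-force tracking error).
    step_cost:     constant penalty per step that encourages fast completion.
    scale:         error magnitude at which the quality term decays to 1/e.
    """
    if quality_error < 0 or scale <= 0 or not 0 <= step_cost < 1:
        raise ValueError("invalid reward parameters")
    # exp(-x) maps [0, inf) into (0, 1], so the quality term cannot blow up.
    quality_term = math.exp(-quality_error / scale)
    return quality_term - step_cost


def discounted_return(rewards, gamma=0.99):
    """Discounted sum of a finite reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def return_bound(step_cost=0.1, gamma=0.99):
    """Upper bound on |discounted return| implied by the bounded reward."""
    r_abs_max = max(step_cost, 1.0 - step_cost)
    return r_abs_max / (1.0 - gamma)


if __name__ == "__main__":
    # Rewards along a made-up episode in which the error shrinks over time.
    episode = [bounded_wiping_reward(e) for e in (5.0, 2.0, 0.5, 0.0)]
    print(episode)
    print(discounted_return(episode), "<=", return_bound())
```

Because every per-step reward lies in a fixed interval, the discounted return is bounded by the largest absolute reward divided by (1 - gamma), regardless of how poorly an early policy wipes. An unbounded penalty on quality error would not give this guarantee.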

