RO · AI · LG · Sep 4, 2025

Reactive In-Air Clothing Manipulation with Confidence-Aware Dense Correspondence and Visuotactile Affordance

arXiv:2509.03889v1 · 2 citations · h-index: 6
Originality Highly original
AI Analysis

This work addresses robotic clothing manipulation for applications such as domestic assistance. It aims to generalize beyond prior methods, which flatten garments first or assume that key features are visible.

The paper tackles manipulation of crumpled and suspended clothing with a dual-arm visuotactile framework that combines confidence-aware dense visual correspondence with tactile-supervised grasp affordance. Its task-agnostic grasp selection module is demonstrated on folding and hanging tasks, and a reactive state machine adapts the strategy to perceptual uncertainty.

Manipulating clothing is challenging due to complex configurations, variable material dynamics, and frequent self-occlusion. Prior systems often flatten garments or assume visibility of key features. We present a dual-arm visuotactile framework that combines confidence-aware dense visual correspondence and tactile-supervised grasp affordance to operate directly on crumpled and suspended garments. The correspondence model is trained on a custom, high-fidelity simulated dataset using a distributional loss that captures cloth symmetries and generates correspondence confidence estimates. These estimates guide a reactive state machine that adapts folding strategies based on perceptual uncertainty. In parallel, a visuotactile grasp affordance network, self-supervised using high-resolution tactile feedback, determines which regions are physically graspable. The same tactile classifier is used during execution for real-time grasp validation. By deferring action in low-confidence states, the system handles highly occluded table-top and in-air configurations. We demonstrate our task-agnostic grasp selection module in folding and hanging tasks. Moreover, our dense descriptors provide a reusable intermediate representation for other planning modalities, such as extracting grasp targets from human video demonstrations, paving the way for more generalizable and scalable garment manipulation.
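The idea of deferring action in low-confidence states can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of the peak softmax probability as a confidence proxy, and the 0.5 threshold are all assumptions made for illustration. The sketch takes a small grid of correspondence logits for one target keypoint and a boolean graspability mask (standing in for the output of the affordance network), and returns either a grasp decision or a deferral.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Decision:
    action: str                # "grasp" or "defer" (hypothetical labels)
    pixel: Tuple[int, int]     # (row, col) of the most likely match
    confidence: float          # peak probability of the correspondence map


def softmax2d(logits: List[List[float]]) -> List[List[float]]:
    """Normalize a 2D grid of logits into a probability distribution."""
    peak = max(v for row in logits for v in row)
    exps = [[math.exp(v - peak) for v in row] for row in logits]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def decide(
    logits: List[List[float]],
    graspable: List[List[bool]],
    threshold: float = 0.5,    # assumed value, not taken from the paper
) -> Decision:
    """Grasp only if the match is confident and the region is graspable."""
    probs = softmax2d(logits)
    best_p, best_px = -1.0, (0, 0)
    for r, row in enumerate(probs):
        for c, p in enumerate(row):
            if p > best_p:
                best_p, best_px = p, (r, c)
    r, c = best_px
    if best_p >= threshold and graspable[r][c]:
        return Decision("grasp", best_px, best_p)
    # Low confidence or ungraspable region: defer, e.g. to re-observe.
    return Decision("defer", best_px, best_p)


if __name__ == "__main__":
    all_ok = [[True, True], [True, True]]
    print(decide([[0.0, 0.0], [0.0, 10.0]], all_ok))  # sharply peaked map
    print(decide([[0.0, 0.0], [0.0, 0.0]], all_ok))   # flat map
```

With the sharply peaked map, the decision is a grasp at pixel (1, 1); with the flat map, the peak probability is 0.25, which falls below the assumed threshold, so the action is deferred. A real system would presumably use a richer uncertainty measure and a multi-state controller; the sketch only shows the gating pattern.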

