CheckManual: A New Challenge and Benchmark for Manual-based Appliance Manipulation
This work addresses robot manipulation of complex appliances by leveraging their manuals, a domain-specific problem in robotics and AI. The contribution is incremental: it builds on existing manipulation research by adding manual comprehension.
The authors tackle the problem of enabling robots to operate electrical appliances by first reviewing their manuals. They introduce CheckManual, a benchmark for manual-based appliance manipulation with novel challenges, metrics, and simulator environments. They also develop a data generation pipeline that creates manuals from CAD appliance models, and they propose the ManualPlan model as a baseline that provides initial results on the new benchmark.
Correct use of electrical appliances has significantly improved the quality of human life. Unlike simple tools that can be manipulated with common sense, the different parts of an electrical appliance have specific functions defined by the manufacturer. If we want a robot to heat bread in a microwave, we should enable it to review the microwave's manual first. From the manual, it can learn about the appliance's component functions, interaction methods, and representative task steps. However, previous manual-related works remain limited to question-answering tasks, while existing manipulation research ignores the manual's important role and fails to comprehend multi-page manuals. In this paper, we propose CheckManual, the first manual-based appliance manipulation benchmark. Specifically, we design a large model-assisted, human-revised data generation pipeline to create manuals based on CAD appliance models. With these manuals, we establish novel manual-based manipulation challenges, metrics, and simulator environments for evaluating model performance. Furthermore, we propose ManualPlan, the first manual-based manipulation planning model, to set up a group of baselines for the CheckManual benchmark.