ExACT: An End-to-End Autonomous Excavator System Using Action Chunking With Transformers
This work addresses the need for safer and more efficient autonomous excavators in construction and mining, though the contribution is incremental: it applies existing imitation learning methods to a specific domain.
The paper tackles autonomous excavation by introducing ExACT, an end-to-end system that uses imitation learning with Action Chunking with Transformers (ACT) to control excavator valves directly from raw sensor data. In a simulator, ExACT completes reaching, digging, and dumping tasks after training on only a few human demonstrations.
Excavators are crucial for diverse tasks such as construction and mining, and autonomous excavator systems enhance safety and efficiency, address labor shortages, and improve human working conditions. Unlike existing modular approaches, this paper introduces ExACT, an end-to-end autonomous excavator system that processes raw LiDAR data, camera data, and joint positions to control excavator valves directly. Built on the Action Chunking with Transformers (ACT) architecture, ExACT uses imitation learning to take observations from multi-modal sensors as input and generate action sequences. In our experiments, we build a simulator from captured real-world data to model the relationship between excavator valve states and joint velocities. With only a few human-operated demonstration trajectories, ExACT learns to complete different excavation tasks, including reaching, digging, and dumping, in validation with the simulator. To the best of our knowledge, ExACT is the first step toward building an end-to-end autonomous excavator system via imitation learning with a minimal set of human demonstrations. A video of this work is available at https://youtu.be/NmzR_Rf-aEk.
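As a rough illustration of the action-chunking idea behind ACT, the sketch below shows one way to execute a policy that predicts a chunk of k future actions at every control step. Overlapping predictions for the same timestep are blended with exponentially decaying weights, which is the temporal ensembling scheme described in the original ACT work. The function names, the weighting constant m, the chunk length, and the layout of the action vector are illustrative assumptions. The abstract does not say whether ExACT uses temporal ensembling, nor what its valve command dimension is.

```python
import numpy as np


def temporal_ensemble(chunks, t, m=0.1):
    """Blend every action predicted for timestep t by overlapping chunks.

    chunks maps the step at which a chunk was predicted to an array of
    shape (k, action_dim). Weights are exp(-m * i), where i = 0 is the
    oldest prediction, so older predictions get the larger weight.
    """
    preds = []
    for start in sorted(chunks):  # oldest chunk first
        offset = t - start
        if 0 <= offset < len(chunks[start]):
            preds.append(chunks[start][offset])
    if not preds:
        raise ValueError(f"no chunk covers timestep {t}")
    preds = np.stack(preds)
    weights = np.exp(-m * np.arange(len(preds)))
    weights /= weights.sum()
    return weights @ preds


def rollout(policy, observe, apply_action, steps, m=0.1):
    """Query the policy every step and execute the ensembled action.

    policy, observe, and apply_action are hypothetical callables:
    policy(obs) returns a (k, action_dim) chunk, observe() returns the
    current sensor observation, and apply_action(a) sends one command.
    """
    chunks = {}
    executed = []
    for t in range(steps):
        chunks[t] = np.asarray(policy(observe()), dtype=float)
        # Drop chunks that no longer reach the current timestep.
        chunks = {s: c for s, c in chunks.items() if t - s < len(c)}
        action = temporal_ensemble(chunks, t, m)
        apply_action(action)
        executed.append(action)
    return executed
```

In a setup like the one described here, `policy` would wrap the trained transformer and `apply_action` would forward valve commands to the simulator. Setting m to 0 averages all overlapping predictions equally, while a larger m leans more heavily on the oldest prediction.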