ROAug 7, 2023
MOMA-Force: Visual-Force Imitation for Real-World Mobile ManipulationTaozheng Yang, Ya Jing, Hongtao Wu et al. · bytedance
In this paper, we present a novel method for mobile manipulators to perform multiple contact-rich manipulation tasks. While learning-based methods have the potential to generate actions in an end-to-end manner, they often suffer from insufficient action accuracy and robustness against noise. On the other hand, classical control-based methods can enhance system robustness, but at the cost of extensive parameter tuning. To address these challenges, we present MOMA-Force, a visual-force imitation method that seamlessly combines representation learning for perception, imitation learning for complex motion generation, and admittance whole-body control for system robustness and controllability. MOMA-Force enables a mobile manipulator to learn multiple complex contact-rich tasks with high success rates and small contact forces. In a real household setting, our method outperforms baseline methods in terms of task success rates. Moreover, our method achieves smaller contact forces and smaller force variances compared to baseline methods without force imitation. Overall, we offer a promising approach for efficient and robust mobile manipulation in the real world. Videos and more details can be found on \url{https://visual-force-imitation.github.io}
ROAug 7, 2023
Exploring Visual Pre-training for Robot Manipulation: Datasets, Models and MethodsYa Jing, Xuelin Zhu, Xingbin Liu et al. · bytedance
Visual pre-training with large-scale real-world data has made great progress in recent years, showing great potential in robot learning with pixel observations. However, the recipes of visual pre-training for robot manipulation tasks are yet to be built. In this paper, we thoroughly investigate the effects of visual pre-training strategies on robot manipulation tasks from three fundamental perspectives: pre-training datasets, model architectures and training methods. Several significant experimental findings are provided that are beneficial for robot learning. Further, we propose a visual pre-training scheme for robot manipulation termed Vi-PRoM, which combines self-supervised learning and supervised learning. Concretely, the former employs contrastive learning to acquire underlying patterns from large-scale unlabeled data, while the latter aims learning visual semantics and temporal dynamics. Extensive experiments on robot manipulations in various simulation environments and the real robot demonstrate the superiority of the proposed scheme. Videos and more details can be found on \url{https://explore-pretrain-robot.github.io}.
ROOct 6, 2022
Embodied Referring Expression for Manipulation Question Answering in Interactive EnvironmentQie Sima, Sinan Tan, Huaping Liu
Embodied agents are expected to perform more complicated tasks in an interactive environment, with the progress of Embodied AI in recent years. Existing embodied tasks including Embodied Referring Expression (ERE) and other QA-form tasks mainly focuses on interaction in term of linguistic instruction. Therefore, enabling the agent to manipulate objects in the environment for exploration actively has become a challenging problem for the community. To solve this problem, We introduce a new embodied task: Remote Embodied Manipulation Question Answering (REMQA) to combine ERE with manipulation tasks. In the REMQA task, the agent needs to navigate to a remote position and perform manipulation with the target object to answer the question. We build a benchmark dataset for the REMQA task in the AI2-THOR simulator. To this end, a framework with 3D semantic reconstruction and modular network paradigms is proposed. The evaluation of the proposed framework on the REMQA dataset is presented to validate its effectiveness.
ROApr 21, 2020
Simultaneous Trajectory Optimization and Force Control with Soft Contact MechanicsLasitha Wijayarathne, Qie Sima, Ziyi Zhou et al.
Force modulation of robotic manipulators has been extensively studied for several decades but is not yet commonly used in safety-critical applications due to a lack of accurate interaction contact modeling and weak performance guarantees - a large proportion of them concerning the modulation of interaction forces. This study presents a high-level framework for simultaneous trajectory optimization and force control of the interaction between manipulator and soft environments. Sliding friction and normal contact force are taken into account. The dynamics of the soft contact model and the manipulator dynamics are simultaneously incorporated in the trajectory optimizer to generate desired motion and force profiles. A constraint optimization framework based on Differential Dynamic Programming and Alternative Direction Method of Multipliers has been employed to generate optimal control input and high-dimensional state trajectories. Experimental validation of the model performance is conducted on a soft substrate with known material properties using Cartesian space force control mode. Results show a comparison of ground truth and predicted model based contact force states for a few cartesian motions and the validity range of the friction model. Potential applications include high-level task planning of medical tasks involving manipulation of compliant, delicate, and deformable tissues.