82.8ROMar 26
ThermoAct:Thermal-Aware Vision-Language-Action Models for Robotic Perception and Decision-MakingYoung-Chae Son, Dae-Kwan Ko, Yoon-Ji Choi et al.
In recent human-robot collaboration environments, there is a growing focus on integrating diverse sensor data beyond visual information to enable safer and more intelligent task execution. Although thermal data can be crucial for enhancing robot safety and operational efficiency, its integration has been relatively overlooked in prior research. This paper proposes a novel Vision-Language-Action (VLA) framework that incorporates thermal information for robot task execution. The proposed system leverages a Vision-Language Model (VLM) as a high-level planner to interpret complex natural language commands and decompose them into simpler sub-tasks. This approach facilitates efficient data collection and robust reasoning for complex operations. Unlike conventional methods that rely solely on visual data, our approach integrates thermal information, enabling the robot to perceive physical properties and proactively ensure environmental safety. Experimental results from real-world task scenarios validate the feasibility of our proposed framework, suggesting its potential to enhance task success rates and safety compared to existing vision-based systems.
RODec 4, 2023
Robot Synesthesia: In-Hand Manipulation with Visuotactile SensingYing Yuan, Haichuan Che, Yuzhe Qin et al.
Executing contact-rich manipulation tasks necessitates the fusion of tactile and visual feedback. However, the distinct nature of these modalities poses significant challenges. In this paper, we introduce a system that leverages visual and tactile sensory inputs to enable dexterous in-hand manipulation. Specifically, we propose Robot Synesthesia, a novel point cloud-based tactile representation inspired by human tactile-visual synesthesia. This approach allows for the simultaneous and seamless integration of both sensory inputs, offering richer spatial information and facilitating better reasoning about robot actions. The method, trained in a simulated environment and then deployed to a real robot, is applicable to various in-hand object rotation tasks. Comprehensive ablations are performed on how the integration of vision and touch can improve reinforcement learning and Sim2Real performance. Our project page is available at https://yingyuan0414.github.io/visuotactile/ .
ROJan 23, 2024
DexTouch: Learning to Seek and Manipulate Objects with Tactile DexterityKang-Won Lee, Yuzhe Qin, Xiaolong Wang et al.
The sense of touch is an essential ability for skillfully performing a variety of tasks, providing the capacity to search and manipulate objects without relying on visual information. In this paper, we introduce a multi-finger robot system designed to manipulate objects using the sense of touch, without relying on vision. For tasks that mimic daily life, the robot uses its sense of touch to manipulate randomly placed objects in dark. The objective of this study is to enable robots to perform blind manipulation by using tactile sensation to compensate for the information gap caused by the absence of vision, given the presence of prior information. Training the policy through reinforcement learning in simulation and transferring the trained policy to the real environment, we demonstrate that blind manipulation can be applied to robots without vision. In addition, the experiments showcase the importance of tactile sensing in the blind manipulation tasks. Our project page is available at https://lee-kangwon.github.io/dextouch/