HC LG RO Jul 21, 2023

Systematic Adaptation of Communication-focused Machine Learning Models from Real to Virtual Environments for Human-Robot Collaboration

arXiv:2307.11327v1 2 citations h-index: 34
Originality: Incremental advance
AI Analysis

This work addresses the problem of computationally or economically prohibitive dataset creation in virtual reality for robotics researchers and developers, offering an incremental solution by adapting existing models.

The paper tackles the challenge of adapting trained deep learning models for hand gesture recognition from real to virtual environments to enable human-robot collaboration, presenting a systematic framework that uses limited virtual datasets and provides guidelines for dataset curation.

Virtual reality has proved useful in fields ranging from gaming, medicine, and training to the development of interfaces that enable human-robot collaboration. It empowers designers to explore applications outside the constraints of the real-world environment and to develop innovative solutions and experiences. Hand gesture recognition, which has been a topic of much research and subsequent commercialization in the real world, has been possible because of the creation of large, labelled datasets. To utilize the power of natural and intuitive hand gestures in the virtual domain for embodied teleoperation of collaborative robots, similarly large datasets must be created so that the working interface stays easy to learn and flexible enough to accommodate more gestures. Depending on the application, this may be computationally or economically prohibitive. Adapting trained deep learning models that perform well in the real environment to the virtual one may therefore be a solution to this challenge. This paper presents a systematic framework for real-to-virtual adaptation using a virtual dataset of limited size, along with guidelines for creating a curated dataset. Finally, while hand gestures have been considered as the communication mode, the guidelines and recommendations presented are generic. They are applicable to other modes, such as body poses and facial expressions, for which large datasets are available in the real domain and must be adapted to the virtual one.

