Seamless Integration and Coordination of Cognitive Skills in Humanoid Robots: A Deep Learning Approach
This work addresses the challenge of integrating multiple cognitive skills in humanoid robots. The contribution is incremental: it builds on existing deep learning approaches to improve coordination among those skills, and it is evaluated in synthetic robotic experiments.
The study tackled the problem of coordinating cognitive processes in humanoid robots by developing a deep dynamic neural network model that integrates perception, action, and decision-making through end-to-end learning of visuomotor streams. Results showed the model successfully learned and generalized tutored skills, achieving seamless coordination of cognitive skills like visual perception and intention reading.
This study investigates how adequate coordination among the different cognitive processes of a humanoid robot can be developed through end-to-end learning from direct perception of the visuomotor stream. We propose a deep dynamic neural network model composed of a dynamic vision network, a motor generation network, and a higher-level network. The model is designed to process and integrate directly perceived dynamic visuomotor patterns within a hierarchy in which each level is subject to different spatial and temporal constraints. We conducted synthetic robotic experiments in which a robot learned to read a human's intention by observing gestures and then to generate the corresponding goal-directed actions. The results show that the proposed model is able to learn the tutored skills and to generalize them to novel situations. The model exhibited synergistic coordination of perception, action, and decision making, and it seamlessly integrated and coordinated a set of cognitive skills including visual perception, intention reading, attention switching, working memory, and action preparation and execution. Analysis reveals that coherent internal representations emerged at each level of the hierarchy. Higher-level representations reflecting action intention developed through continuous integration of the lower-level visuo-proprioceptive stream.
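To make the hierarchical arrangement described above more concrete, the following is a minimal sketch of three coupled modules (vision, motor, higher level) that update on different timescales. It is not the model used in this study. The leaky-integrator update, the time constants `tau_low` and `tau_high`, the layer sizes, the flat (non-convolutional) vision input, and all names such as `ThreeLevelSketch` and `leaky_step` are assumptions chosen for illustration. The weights are random and untrained, so the sketch shows only the information flow, not learned behavior.

```python
import numpy as np


def leaky_step(u, inp, tau):
    """One leaky-integrator update; a larger tau gives slower dynamics."""
    return (1.0 - 1.0 / tau) * u + (1.0 / tau) * inp


class ThreeLevelSketch:
    """Hypothetical three-module hierarchy: vision, motor, and higher level."""

    def __init__(self, n_vis=16, n_mot=4, n_hid=8, n_high=4, seed=0):
        rng = np.random.default_rng(seed)

        def weights(rows, cols):
            # Random, untrained weights (illustration only).
            return rng.normal(0.0, 0.3, size=(rows, cols))

        self.n_hid, self.n_high = n_hid, n_high
        self.W_vis_in = weights(n_hid, n_vis)     # visual input -> vision module
        self.W_vis_top = weights(n_hid, n_high)   # higher level -> vision module
        self.W_mot_in = weights(n_hid, n_mot)     # joint input -> motor module
        self.W_mot_top = weights(n_hid, n_high)   # higher level -> motor module
        self.W_high_vis = weights(n_high, n_hid)  # vision module -> higher level
        self.W_high_mot = weights(n_high, n_hid)  # motor module -> higher level
        self.W_high_rec = weights(n_high, n_high)  # higher-level recurrence
        self.W_out = weights(n_mot, n_hid)        # motor module -> predicted joints
        # Assumed time constants: fast lower modules, slow higher level.
        self.tau_low, self.tau_high = 2.0, 20.0

    def run(self, frames, joints):
        """Process a visuomotor sequence; return motor predictions and the
        higher-level state trajectory."""
        u_vis = np.zeros(self.n_hid)
        u_mot = np.zeros(self.n_hid)
        u_high = np.zeros(self.n_high)
        motor_pred, high_trace = [], []
        for frame, joint in zip(frames, joints):
            h_high = np.tanh(u_high)
            # Lower modules combine bottom-up input with top-down context.
            u_vis = leaky_step(
                u_vis, self.W_vis_in @ frame + self.W_vis_top @ h_high, self.tau_low
            )
            u_mot = leaky_step(
                u_mot, self.W_mot_in @ joint + self.W_mot_top @ h_high, self.tau_low
            )
            # The higher level slowly integrates both lower-level streams.
            top_in = (
                self.W_high_vis @ np.tanh(u_vis)
                + self.W_high_mot @ np.tanh(u_mot)
                + self.W_high_rec @ h_high
            )
            u_high = leaky_step(u_high, top_in, self.tau_high)
            motor_pred.append(np.tanh(self.W_out @ np.tanh(u_mot)))
            high_trace.append(u_high.copy())
        return np.array(motor_pred), np.array(high_trace)


# Usage with random stand-in data: 50 time steps of flattened visual
# features and joint angles.
rng = np.random.default_rng(1)
frames = rng.random((50, 16))
joints = rng.random((50, 4))
model = ThreeLevelSketch()
motor_pred, high_trace = model.run(frames, joints)
```

In this sketch the only temporal constraint is the time constant assigned to each module, and spatial constraints (for example, local receptive fields in the vision pathway) are omitted. A trained version would additionally require a loss on the predicted visuomotor stream and gradient-based optimization, which are outside the scope of this illustration.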