Let's Give a Voice to Conversational Agents in Virtual Reality
This work addresses the need for easier development of multimodal conversational agents in VR, particularly for digital health applications. Its contribution is incremental: it builds on existing technologies rather than introducing a new paradigm.
The authors present an open-source architecture that simplifies the creation of conversational agents in virtual reality. The architecture enables voice-based interaction through customizable speech models, and the authors demonstrate it with two prototypes in the digital health domain.
The dialogue experience with conversational agents can be greatly enhanced with multimodal and immersive interactions in virtual reality. In this work, we present an open-source architecture with the goal of simplifying the development of conversational agents operating in virtual environments. The architecture offers the possibility of plugging in conversational agents of different domains and adding custom or cloud-based Speech-To-Text and Text-To-Speech models to make the interaction voice-based. Using this architecture, we present two conversational prototypes operating in the digital health domain developed in Unity for both non-immersive displays and VR headsets.
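To make the plug-in idea concrete, the following is a minimal sketch of how interchangeable Speech-To-Text, agent, and Text-To-Speech components could be composed into one voice turn. It is not the authors' implementation: the prototypes described above are built in Unity, and every name here (`SpeechToText`, `ConversationalAgent`, `TextToSpeech`, `VoicePipeline`, and the stand-in classes) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical interfaces: any component with a matching method can be plugged in.
class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class ConversationalAgent(Protocol):
    def reply(self, utterance: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    """Chains STT -> agent -> TTS for a single user turn."""
    stt: SpeechToText
    agent: ConversationalAgent
    tts: TextToSpeech

    def handle_turn(self, audio_in: bytes) -> bytes:
        text_in = self.stt.transcribe(audio_in)   # user speech to text
        text_out = self.agent.reply(text_in)      # domain-specific response
        return self.tts.synthesize(text_out)      # response text to audio


# Stand-in components (assumptions): real ones would wrap a custom or cloud model.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        # Pretends the audio bytes are already UTF-8 text.
        return audio.decode("utf-8")


class KeywordHealthAgent:
    def reply(self, utterance: str) -> str:
        # Toy rule standing in for a digital health dialogue model.
        if "sleep" in utterance.lower():
            return "How many hours did you sleep last night?"
        return "Could you tell me more about how you feel?"


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        # Pretends the UTF-8 bytes are synthesized audio.
        return text.encode("utf-8")


pipeline = VoicePipeline(stt=EchoSTT(), agent=KeywordHealthAgent(), tts=BytesTTS())
audio_out = pipeline.handle_turn(b"I sleep badly")
```

Because the pipeline depends only on the three method signatures, swapping a cloud-based speech model for a local one, or a health agent for an agent from another domain, would not require changes to `VoicePipeline` itself.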