Real-Time Robot Localization, Vision, and Speech Recognition on Nvidia Jetson TX1
This work addresses the energy constraints of mobile robots by integrating multiple services efficiently on an embedded platform, though the contribution is incremental because it builds on existing methods.
The paper integrates real-time localization, vision, and speech recognition services on the Nvidia Jetson TX1 within a power envelope of about 10 W, and explores whether offloading to the cloud can improve energy efficiency while still meeting real-time requirements.
Robotic systems are complex, often consisting of basic services such as SLAM for localization and mapping, Convolutional Neural Networks for scene understanding, and Speech Recognition for user interaction. Meanwhile, robots are mobile and usually have tight energy constraints, so integrating these services onto an embedded platform that consumes around 10 W of power is critical to the proliferation of mobile robots. In this paper, we present a case study on integrating real-time localization, vision, and speech recognition services on a mobile SoC, the Nvidia Jetson TX1, within a power envelope of about 10 W. In addition, we explore whether offloading some of the services to a cloud platform can lead to further energy efficiency while meeting the real-time requirements.
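The offloading question raised above can be framed as a per-request comparison of energy and latency. The following is a minimal sketch under stated assumptions, not the method used in the paper: the `ServiceProfile` fields, the simple power-times-time energy model, and the function names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServiceProfile:
    """Hypothetical per-request profile of one service (all fields are assumptions)."""
    local_power_w: float     # average device power while computing locally
    local_latency_s: float   # time to finish the request on the device
    radio_power_w: float     # average device power while sending/receiving data
    transfer_time_s: float   # time spent sending the input and receiving the result
    idle_power_w: float      # device power while waiting for the cloud
    cloud_latency_s: float   # server-side processing time
    deadline_s: float        # real-time deadline for the request


def local_energy_j(p: ServiceProfile) -> float:
    # Energy (J) = power (W) * time (s) for on-device execution.
    return p.local_power_w * p.local_latency_s


def offload_energy_j(p: ServiceProfile) -> float:
    # Device-side energy only: radio activity plus idling while the cloud works.
    return p.radio_power_w * p.transfer_time_s + p.idle_power_w * p.cloud_latency_s


def offload_latency_s(p: ServiceProfile) -> float:
    # End-to-end latency seen by the robot when the request is offloaded.
    return p.transfer_time_s + p.cloud_latency_s


def should_offload(p: ServiceProfile) -> bool:
    # Offload only if the deadline is still met AND the device spends less energy.
    meets_deadline = offload_latency_s(p) <= p.deadline_s
    saves_energy = offload_energy_j(p) < local_energy_j(p)
    return meets_deadline and saves_energy
```

Under this model, a service is a candidate for offloading only when the network round trip fits within its deadline and the device-side energy of communicating is lower than that of computing locally. Any values passed in would have to come from measurements on the target platform.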