Multimodal VAE Active Inference Controller
This work addresses the challenge of applying active inference to complex robotic tasks with multimodal data, although the contribution appears incremental, being an extension of previous methods.
The authors tackled the problem of scaling active inference to high-dimensional multimodal inputs for continuous control in industrial robot arms, achieving improved tracking, high robustness to noise, and adaptability without retuning.
Active inference, a theoretical construct inspired by brain processing, is a promising alternative for controlling artificial agents. However, current methods do not yet scale to high-dimensional inputs in continuous control. Here we present a novel active inference torque controller for industrial arms that maintains the adaptive characteristics of previous proprioceptive approaches but also enables large-scale multimodal integration (e.g., raw images). We extended our previous mathematical formulation by including multimodal state representation learning using a linearly coupled multimodal variational autoencoder. We evaluated our model on a simulated 7-DOF Franka Emika Panda robot arm and compared its behavior with a previous active inference baseline and the Panda built-in optimized controller. Results showed improved tracking and control in goal-directed reaching due to the increased representational power, high robustness to noise, and adaptability to changes in environmental conditions and robot parameters, without the need to relearn the generative models or retune parameters.
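
To give a feel for the kind of loop that proprioceptive active inference controllers build on, the following is a minimal toy sketch and not the controller described above. It assumes a single joint, a velocity-commanded plant (dq/dt = u), a quadratic free energy with one sensory term and one goal term, and the approximation that the sensor changes one-to-one with the action. The function name, gains, and variances are all hypothetical choices made for illustration; the multimodal variational autoencoder and the torque-level formulation are deliberately left out.

```python
import random


def simulate_reaching(goal=1.0, steps=2000, dt=0.01,
                      k_mu=5.0, k_a=20.0,
                      sigma_y=1.0, sigma_mu=1.0,
                      noise_std=0.0, seed=0):
    """Toy 1-DOF active inference loop (illustrative assumption only).

    Free energy assumed here:
        F = (y - mu)^2 / (2 * sigma_y) + (mu - goal)^2 / (2 * sigma_mu)
    """
    rng = random.Random(seed)
    q = 0.0    # true joint position (hidden from the agent)
    mu = 0.0   # agent's belief about the joint position
    u = 0.0    # action: commanded joint velocity (assumed plant: dq/dt = u)

    for _ in range(steps):
        y = q + rng.gauss(0.0, noise_std)  # noisy proprioceptive reading

        eps_y = (y - mu) / sigma_y         # precision-weighted sensory error
        eps_mu = (mu - goal) / sigma_mu    # precision-weighted goal error

        # Perception: gradient descent of F with respect to the belief mu.
        d_mu = k_mu * (eps_y - eps_mu)
        # Action: gradient descent of F with respect to u, assuming dy/du = 1.
        d_u = -k_a * eps_y

        # Explicit Euler step for belief, action, and plant.
        mu += dt * d_mu
        u += dt * d_u
        q += dt * u

    return q, mu


if __name__ == "__main__":
    q_final, mu_final = simulate_reaching()
    print(f"final position: {q_final:.4f}, final belief: {mu_final:.4f}")
```

In this sketch the goal enters only as a prior on the belief; the action then moves the joint so that the sensed position agrees with that belief. Passing a nonzero `noise_std` gives a rough impression of how the loop behaves under sensor noise, though it says nothing about the robustness results reported for the actual controller.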