CV | May 9, 2024

A Mixture of Experts Approach to 3D Human Motion Prediction

arXiv:2405.06088v1 | 1 citation | Has Code
Originality: Incremental advance
AI Analysis

This work addresses real-time motion prediction for applications like autonomous vehicles, but it is incremental as it builds on existing transformer methods.

The paper tackles human motion prediction by replicating a state-of-the-art Spatio-Temporal Transformer model and proposing a novel Mixture of Experts (MoE) architecture to improve real-time inference speed, achieving competitive performance on benchmark datasets.

This project addresses the challenge of human motion prediction, a critical area for applications such as autonomous vehicle movement detection. Previous works have emphasized the need for low inference times to provide real-time performance for applications like these. Our primary objective is to critically evaluate existing model architectures, identifying their advantages and opportunities for improvement, by replicating the state-of-the-art (SOTA) Spatio-Temporal Transformer model as closely as possible given computational constraints. These models have surpassed the limitations of RNN-based models and have demonstrated the ability to generate plausible motion sequences over both short- and long-term horizons through the use of spatio-temporal representations. We also propose a novel architecture to address the challenge of real-time inference speed by incorporating a Mixture of Experts (MoE) block within the Spatial-Temporal (ST) attention layer. The particular variation used is Soft MoE, a fully differentiable sparse Transformer that has shown a promising ability to enable larger model capacity at lower inference cost. We make our code publicly available at https://github.com/edshieh/motionprediction
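To make the Soft MoE idea mentioned in the abstract more concrete, here is a minimal NumPy sketch of a generic Soft MoE forward pass. It is not taken from the authors' repository and does not show how the block sits inside the ST attention layer. The function names (`softmax`, `soft_moe`), the tensor shapes, and the choice of a single linear map plus ReLU as each expert are assumptions made purely for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def soft_moe(tokens, phi, expert_weights):
    """Illustrative Soft MoE forward pass for one sequence.

    tokens:         (n_tokens, d_model) input tokens
    phi:            (d_model, n_experts * slots_per_expert) slot parameters
    expert_weights: (n_experts, d_model, d_model); each expert is assumed
                    to be one linear map followed by ReLU (hypothetical choice)
    """
    n_experts = expert_weights.shape[0]
    n_slots = phi.shape[1]
    slots_per_expert = n_slots // n_experts

    logits = tokens @ phi                  # (n_tokens, n_slots)
    dispatch = softmax(logits, axis=0)     # normalise over tokens, per slot
    combine = softmax(logits, axis=1)      # normalise over slots, per token

    # Every slot is a weighted average of all tokens, so no token is dropped
    # and the routing stays differentiable.
    slots = dispatch.T @ tokens            # (n_slots, d_model)
    slots = slots.reshape(n_experts, slots_per_expert, -1)

    # Each expert processes only its own slots.
    expert_out = np.einsum("esd,edh->esh", slots, expert_weights)
    expert_out = np.maximum(expert_out, 0.0).reshape(n_slots, -1)

    # Every output token is a weighted average of all expert outputs.
    return combine @ expert_out            # (n_tokens, d_model)


rng = np.random.default_rng(0)
n_tokens, d_model, n_experts, slots_per_expert = 6, 8, 4, 1
tokens = rng.normal(size=(n_tokens, d_model))
phi = rng.normal(size=(d_model, n_experts * slots_per_expert))
expert_weights = rng.normal(size=(n_experts, d_model, d_model))
output = soft_moe(tokens, phi, expert_weights)
print(output.shape)
```

In this sketch the expert compute depends on the number of slots rather than the number of tokens, which is one way such a layer can add capacity (more experts) without a matching increase in per-token cost.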
