MoST: Motion Style Transformer between Diverse Action Contents
This work addresses motion style transfer for computer graphics and animation and reports improved performance on motion pairs with different action contents, though the contribution appears incremental because it builds on existing style transfer concepts.
The paper tackles motion style transfer between motions with different contents, a setting where existing methods perform poorly. It proposes a motion style transformer that disentangles style from content and generates plausible motions with the transferred style, outperforming existing methods and achieving high quality without post-processing.
While existing motion style transfer methods are effective between two motions with identical content, their performance diminishes significantly when transferring style between motions with different contents. The challenge lies in the lack of a clear separation between the content and the style of a motion. To tackle this challenge, we propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with the style transferred from a source motion. Our approach to disentanglement is twofold: (1) a new architecture for the motion style transformer with a "part-attentive style modulator across body parts" and "Siamese encoders that encode style and content features separately"; (2) a style disentanglement loss. Our method outperforms existing methods and demonstrates exceptionally high quality, particularly on motion pairs with different contents, without the need for heuristic post-processing. Code is available at https://github.com/Boeun-Kim/MoST.
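To make the two architectural ideas concrete, the following is a minimal toy sketch, not the released implementation. It assumes a motion is a dictionary mapping body-part names to per-frame scalar features, that "style" can be approximated by per-part mean and standard deviation over time (an AdaIN-like simplification chosen here for illustration), and that attention across body parts is driven by a hypothetical `affinity` table of scores. All function names (`encode`, `split`, `part_attentive_modulate`, `transfer`) and the `scale` weight are illustrative assumptions rather than names from the paper or its repository.

```python
import math


def mean_std(seq, eps=1e-8):
    """Mean and (eps-stabilised) standard deviation of one feature sequence."""
    m = sum(seq) / len(seq)
    var = sum((x - m) ** 2 for x in seq) / len(seq)
    return m, math.sqrt(var + eps)


def encode(motion, scale):
    """Siamese encoding: the SAME function and weight is used for both inputs.
    The single scalar `scale` stands in for learned encoder weights (assumption)."""
    return {part: [scale * x for x in seq] for part, seq in motion.items()}


def split(features):
    """Separate each part into content (normalised sequence) and style (mean, std)."""
    content, style = {}, {}
    for part, seq in features.items():
        m, s = mean_std(seq)
        content[part] = [(x - m) / s for x in seq]
        style[part] = (m, s)
    return content, style


def softmax(scores):
    top = max(scores)
    exps = [math.exp(v - top) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


def part_attentive_modulate(content, style, affinity):
    """For every content part, attend over all style parts and apply the
    attention-weighted style statistics. `affinity[(content_part, style_part)]`
    is a hypothetical score; missing pairs default to 0."""
    style_parts = list(style)
    out = {}
    for c_part, seq in content.items():
        weights = softmax([affinity.get((c_part, s), 0.0) for s in style_parts])
        m = sum(w * style[s][0] for w, s in zip(weights, style_parts))
        sd = sum(w * style[s][1] for w, s in zip(weights, style_parts))
        out[c_part] = [sd * x + m for x in seq]
    return out


def transfer(content_motion, style_motion, scale=1.0, affinity=None):
    """Keep the content of one motion, take the style statistics of the other."""
    affinity = affinity or {}
    content, _ = split(encode(content_motion, scale))
    _, style = split(encode(style_motion, scale))
    return part_attentive_modulate(content, style, affinity)


walk = {"arm": [0.0, 1.0, 2.0, 3.0], "leg": [1.0, 1.0, 2.0, 2.0]}
proud = {"arm": [10.0, 12.0, 10.0, 12.0], "leg": [0.0, 4.0, 0.0, 4.0]}
# Large same-part scores make each part attend almost only to its counterpart.
same_part = {("arm", "arm"): 50.0, ("leg", "leg"): 50.0}
stylised = transfer(walk, proud, affinity=same_part)
```

In this sketch the output keeps the temporal shape of the content motion while each part takes on the attended style statistics. With an empty `affinity` table the attention is uniform, so every part receives the average style of all parts; the learned attention in the paper's modulator plays the role that the hand-written table plays here.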