FC-4DFS: Frequency-controlled Flexible 4D Facial Expression Synthesizing
FC-4DFS addresses the need for more flexible and smoother 4D facial expression synthesis in computer vision and graphics, though the contribution appears incremental because it builds on existing datasets and methods.
The paper tackles the synthesis of 4D facial expression sequences, where current methods lack flexibility and smoothness, by proposing FC-4DFS, a frequency-controlled method that reports state-of-the-art generation results on the CoMA and Florence4D datasets.
4D facial expression synthesis is a critical problem in computer vision and graphics. Current methods lack flexibility and smoothness when simulating the inter-frame motion of expression sequences. In this paper, we propose a frequency-controlled 4D facial expression synthesis method, FC-4DFS. Specifically, we introduce a frequency-controlled LSTM network that generates a 4D facial expression sequence of a given length, frame by frame, from a given neutral landmark. We also propose a temporal coherence loss to enhance the perception of temporal sequence motion and improve the accuracy of relative displacements. Furthermore, we design a Multi-level Identity-Aware Displacement Network, based on a cross-attention mechanism, to reconstruct the 4D facial expression sequences from landmark sequences. Finally, FC-4DFS achieves flexible, state-of-the-art generation results for 4D facial expression sequences of different lengths on the CoMA and Florence4D datasets. The code will be available on GitHub.
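The abstract does not give the form of the temporal coherence loss. As a rough illustration of what a loss on relative displacements could look like, the sketch below compares the inter-frame displacements of a predicted landmark sequence with those of a ground-truth sequence using a mean squared error. The function names, the squared-error form, and the nested-list data layout (frames, then landmarks, then x/y/z coordinates) are all assumptions made for this example, not the paper's definition.

```python
from typing import List, Sequence

# Assumed layout: a sequence is a list of frames, a frame is a list of
# landmarks, and a landmark is a list of coordinates (x, y, z).
Frame = Sequence[Sequence[float]]


def frame_displacements(seq: Sequence[Frame]) -> List[List[List[float]]]:
    """Per-landmark displacement between consecutive frames: d_t = x_{t+1} - x_t."""
    return [
        [[b - a for a, b in zip(p0, p1)] for p0, p1 in zip(f0, f1)]
        for f0, f1 in zip(seq[:-1], seq[1:])
    ]


def temporal_coherence_loss(pred: Sequence[Frame], target: Sequence[Frame]) -> float:
    """Hypothetical loss: mean squared error between the inter-frame
    displacements of the predicted and ground-truth sequences."""
    if len(pred) != len(target) or len(pred) < 2:
        raise ValueError("sequences must have equal length of at least 2 frames")
    total, count = 0.0, 0
    for fp, ft in zip(frame_displacements(pred), frame_displacements(target)):
        for vp, vt in zip(fp, ft):
            for a, b in zip(vp, vt):
                total += (a - b) ** 2
                count += 1
    return total / count


# One landmark moving along x by one unit per frame.
target = [[[0.0, 0.0, 0.0]], [[1.0, 0.0, 0.0]], [[2.0, 0.0, 0.0]]]
static = [[[0.0, 0.0, 0.0]]] * 3  # prediction that never moves
print(temporal_coherence_loss(target, target))  # identical motion
print(temporal_coherence_loss(static, target))  # missing motion is penalized
```

A loss of this kind depends only on motion between frames, so a prediction that is offset from the ground truth by a constant receives zero penalty. For that reason it would presumably be combined with a per-frame position term rather than used alone.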