Junjie Cao

h-index6

2papers

421citations

2 Papers

1.5CVFeb 24, 2023

Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertex Attention

Bin Liu, Xiaolin Wei, Bo Li et al.

Most of the existing audio-driven 3D facial animation methods suffered from the lack of detailed facial expression and head pose, resulting in unsatisfactory experience of human-robot interaction. In this paper, a novel pose-controllable 3D facial animation synthesis method is proposed by utilizing hierarchical audio-vertex attention. To synthesize real and detailed expression, a hierarchical decomposition strategy is proposed to encode the audio signal into both a global latent feature and a local vertex-wise control feature. Then the local and global audio features combined with vertex spatial features are used to predict the final consistent facial animation via a graph convolutional neural network by fusing the intrinsic spatial topology structure of the face model and the corresponding semantic feature of the audio. To accomplish pose-controllable animation, we introduce a novel pose attribute augmentation method by utilizing the 2D talking face technique. Experimental results indicate that the proposed method can produce more realistic facial expressions and head posture movements. Qualitative and quantitative experiments show that the proposed method achieves competitive performance against state-of-the-art methods.

2.3CVJun 28, 2020Code

SMPR: Single-Stage Multi-Person Pose Regression

Junqi Lin, Huixin Miao, Junjie Cao et al.

Existing multi-person pose estimators can be roughly divided into two-stage approaches (top-down and bottom-up approaches) and one-stage approaches. The two-stage methods either suffer high computational redundancy for additional person detectors or group keypoints heuristically after predicting all the instance-free keypoints. The recently proposed single-stage methods do not rely on the above two extra stages but have lower performance than the latest bottom-up approaches. In this work, a novel single-stage multi-person pose regression, termed SMPR, is presented. It follows the paradigm of dense prediction and predicts instance-aware keypoints from every location. Besides feature aggregation, we propose better strategies to define positive pose hypotheses for training which all play an important role in dense pose estimation. The network also learns the scores of estimated poses. The pose scoring strategy further improves the pose estimation performance by prioritizing superior poses during non-maximum suppression (NMS). We show that our method not only outperforms existing single-stage methods and but also be competitive with the latest bottom-up methods, with 70.2 AP and 77.5 AP75 on the COCO test-dev pose benchmark. Code is available at https://github.com/cmdi-dlut/SMPR.