CV · Mar 29, 2022

Neural Face Video Compression using Multiple Views

Anna Volokitin, Stefan Brugger, Ali Benlalah, Sebastian Martin, Brian Amberg, Michael Tschannen

arXiv:2203.15401v210.617 citations · h-index: 36

Originality: Incremental advance

AI Analysis

The paper addresses a specific issue in neural face video compression, which matters for applications that need high-quality video transmission at low bandwidth, but the contribution appears incremental.

Neural face video compression codecs that rely on a single source frame produce inaccurate reconstructions. The paper tackles this problem by using multiple source frames (views of the face) and presents encouraging results.

Recent advances in deep generative models have led to neural face video compression codecs that use an order of magnitude less bandwidth than engineered codecs. These neural codecs reconstruct the current frame by warping a source frame and using a generative model to compensate for imperfections in the warped source frame. The warp is encoded and transmitted using a small number of keypoints rather than a dense flow field, which leads to massive savings compared to traditional codecs. However, by relying on only a single source frame, these methods produce inaccurate reconstructions (e.g., when the head turns, one side of it becomes unoccluded and has to be synthesized). Here, we aim to tackle this issue by relying on multiple source frames (views of the face) and present encouraging results.
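To give a rough sense of why describing the warp with keypoints is so much cheaper than a dense flow field, the sketch below counts the raw values each representation would need per frame. The resolution, keypoint count, and bytes per value are illustrative assumptions and are not taken from the paper, and the helper names are hypothetical.

```python
# Illustrative arithmetic only: every size below is an assumption,
# not a figure reported in the paper.


def dense_flow_values(height: int, width: int) -> int:
    """Scalars in a dense flow field: one (dx, dy) pair per pixel."""
    return height * width * 2


def keypoint_values(num_keypoints: int, values_per_keypoint: int = 2) -> int:
    """Scalars when motion is described by sparse (x, y) keypoints."""
    return num_keypoints * values_per_keypoint


def payload_bytes(num_values: int, bytes_per_value: int = 2) -> int:
    """Uncompressed payload size, assuming a fixed-width encoding per scalar."""
    return num_values * bytes_per_value


HEIGHT, WIDTH = 256, 256  # assumed frame resolution
NUM_KEYPOINTS = 10        # assumed number of keypoints

dense = payload_bytes(dense_flow_values(HEIGHT, WIDTH))
sparse = payload_bytes(keypoint_values(NUM_KEYPOINTS))

print(f"dense flow: {dense} bytes per frame")
print(f"keypoints:  {sparse} bytes per frame")
print(f"ratio:      {dense / sparse:.1f}x")
```

These are uncompressed counts. A real codec would entropy-code either representation, so the ratio printed here shows only the difference in raw motion description and should not be read as the bandwidth saving reported for the codecs.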
