MM · CV · SD · AS · Sep 19, 2022

AutoLV: Automatic Lecture Video Generator

arXiv:2209.08795v1 · 5 citations · h-index: 171
Originality: Incremental advance
AI Analysis

This system reduces instructors' workload and eases lecture dissemination by allowing the language and accent to be changed, but the advance is incremental because it builds on existing speech synthesis and talking-head generation methods.

The authors tackle the problem of generating realistic lecture videos from annotated slides, a reference voice, and a reference portrait video. They report that the resulting system outperforms current approaches in authenticity, naturalness, and accuracy.

We propose an end-to-end lecture video generation system that can generate realistic and complete lecture videos directly from annotated slides, the instructor's reference voice, and the instructor's reference portrait video. Our system is primarily composed of a speech synthesis module with few-shot speaker adaptation and an adversarial learning-based talking-head generation module. It can not only reduce instructors' workload but also change the language and accent, which can help students follow the lecture more easily and enable wider dissemination of lecture content. Our experimental results show that the proposed model outperforms other current approaches in terms of authenticity, naturalness, and accuracy. A video demonstration of how our system works, along with the outcomes of the evaluation and comparison, is available at https://youtu.be/cY6TYkI0cog.
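
The data flow described in the abstract (annotated slides feed a speech synthesis stage, whose audio then drives a talking-head stage) can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the names `Slide`, `Segment`, `generate_lecture`, `placeholder_synthesize`, and `placeholder_animate` are hypothetical, and the two stage functions are trivial stand-ins for the real neural modules.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Slide:
    image_path: str   # hypothetical: path to the rendered slide image
    annotation: str   # text the instructor attached to the slide


@dataclass
class Segment:
    slide: Slide
    audio: bytes          # synthesized speech for this slide
    frames: List[bytes]   # talking-head frames driven by the audio


def generate_lecture(
    slides: List[Slide],
    reference_voice: bytes,
    reference_portrait: bytes,
    synthesize: Callable[[str, bytes], bytes],
    animate: Callable[[bytes, bytes], List[bytes]],
) -> List[Segment]:
    """Run each slide through the two stages named in the abstract."""
    segments = []
    for slide in slides:
        # Stage 1: speech synthesis conditioned on the reference voice.
        audio = synthesize(slide.annotation, reference_voice)
        # Stage 2: talking-head generation driven by the synthesized audio.
        frames = animate(audio, reference_portrait)
        segments.append(Segment(slide, audio, frames))
    return segments


# Placeholder stages (assumptions for illustration only, not real models).
def placeholder_synthesize(text: str, reference_voice: bytes) -> bytes:
    return text.encode("utf-8")


def placeholder_animate(audio: bytes, reference_portrait: bytes) -> List[bytes]:
    # One copy of the portrait per byte of audio, standing in for video frames.
    return [reference_portrait] * len(audio)


if __name__ == "__main__":
    demo_slides = [Slide("slide1.png", "Welcome"), Slide("slide2.png", "Outline")]
    result = generate_lecture(
        demo_slides, b"voice", b"portrait", placeholder_synthesize, placeholder_animate
    )
    print(len(result), "segments generated")
```

Passing the stages in as callables keeps the sketch independent of any particular speech or video model; a real system would substitute its trained modules for the placeholders.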

