Video Generation with Consistency Tuning
This work addresses video quality issues for applications in media and AI, but it appears incremental as it builds on existing video generation techniques.
The paper tackles jitter and noise in long video generation by proposing a novel framework with four modules that optimize background and foreground consistency, yielding videos of high quality compared with those from state-of-the-art methods.
Various studies have recently explored the generation of long videos. However, the generated frames often exhibit jitter and noise. To generate videos without these artifacts, we propose a novel framework composed of four modules: a separate tuning module, an average fusion module, a combined tuning module, and an inter-frame consistency module. Applying the proposed modules in sequence optimizes the consistency of the background and foreground in each video frame. Experimental results demonstrate that videos generated by our method are of high quality compared with those of state-of-the-art methods.
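To make the notion of inter-frame jitter concrete, the following is a minimal sketch of one simple proxy: the mean absolute pixel difference between consecutive frames. This is an illustrative assumption, not the metric or the inter-frame consistency module used in the paper. The names `Frame`, `mean_abs_frame_diff`, and `jitter_score` are hypothetical, and frames are assumed to be small grayscale grids of floats.

```python
from typing import Sequence

# Hypothetical representation: a grayscale frame as rows of pixel intensities.
Frame = Sequence[Sequence[float]]


def mean_abs_frame_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    if count == 0:
        raise ValueError("frames must contain at least one pixel")
    return total / count


def jitter_score(frames: Sequence[Frame]) -> float:
    """Average frame-to-frame difference over a clip (0.0 for a static clip)."""
    if len(frames) < 2:
        return 0.0
    # Compare each frame with its successor.
    diffs = [mean_abs_frame_diff(f0, f1) for f0, f1 in zip(frames, frames[1:])]
    return sum(diffs) / len(diffs)


# A clip that never changes versus one that flips between black and white.
static_clip = [[[0.5, 0.5], [0.5, 0.5]]] * 3
flicker_clip = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
```

Genuine motion also raises this score, so a proxy like this is only meaningful when comparing clips with similar content, for example the same prompt generated with and without a consistency step.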