Reparo: Loss-Resilient Generative Codec for Video Conferencing
Reparo addresses video quality degradation and freezing for users of real-time video conferencing, and it is a novel approach rather than an incremental improvement.
The paper tackles packet loss in video conferencing by introducing Reparo, a generative deep learning framework that generates missing information when a frame or part of a frame is lost. It outperforms state-of-the-art FEC-based solutions on PSNR, SSIM, and LPIPS, and it reduces the occurrence of video freezes.
Packet loss during video conferencing often results in poor quality and video freezing. Retransmitting lost packets is often impractical due to the need for real-time playback, and using Forward Error Correction (FEC) for packet recovery is challenging due to the unpredictable and bursty nature of Internet losses. Excessive redundancy leads to inefficiency and wasted bandwidth, while insufficient redundancy results in undecodable frames, causing video freezes and quality degradation in subsequent frames. To address these issues, we introduce Reparo, a loss-resilient video conferencing framework based on generative deep learning models. Our approach generates missing information when a frame or part of a frame is lost. This generation is conditioned on the data received thus far and draws on the model's understanding of how people and objects appear and interact in the visual world. Experimental results on publicly available video conferencing datasets demonstrate that Reparo outperforms state-of-the-art FEC-based video conferencing solutions in terms of both video quality (measured through PSNR, SSIM, and LPIPS) and the occurrence of video freezes.
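The FEC redundancy trade-off described above can be illustrated with a small simulation. This is a minimal sketch of the FEC baseline problem, not of Reparo itself. It assumes an idealized erasure code in which a frame of k data packets plus r parity packets is decodable whenever at least k of the k + r packets arrive, and it models bursty loss with a simple two-state (Gilbert-Elliott style) process. All function names and parameter values here are hypothetical and chosen for illustration only.

```python
import random


def bursty_losses(n_packets, p_good_to_bad, p_bad_to_good, rng):
    """Return a list of booleans (True = packet lost) from a two-state model.

    Assumption: no packets are lost in the good state and every packet
    is lost in the bad state, which yields bursts of consecutive losses.
    """
    lost = []
    bad = False  # start in the good state
    for _ in range(n_packets):
        # Possibly switch state before each packet is sent.
        if bad:
            if rng.random() < p_bad_to_good:
                bad = False
        else:
            if rng.random() < p_good_to_bad:
                bad = True
        lost.append(bad)
    return lost


def overhead(k, r):
    """Bandwidth overhead of r parity packets per k data packets."""
    return r / k


def decodable_fraction(n_frames, k, r, p_good_to_bad, p_bad_to_good, seed=0):
    """Fraction of frames that an ideal (k + r, k) erasure code can recover."""
    rng = random.Random(seed)
    n = k + r
    losses = bursty_losses(n_frames * n, p_good_to_bad, p_bad_to_good, rng)
    decodable = 0
    for f in range(n_frames):
        frame = losses[f * n:(f + 1) * n]
        received = frame.count(False)
        # An ideal erasure code needs any k of the n packets.
        if received >= k:
            decodable += 1
    return decodable / n_frames


if __name__ == "__main__":
    # Sweep the parity count to see overhead versus undecodable frames.
    for r in (0, 2, 5, 10):
        frac = decodable_fraction(2000, k=10, r=r,
                                  p_good_to_bad=0.02, p_bad_to_good=0.3)
        print(f"parity={r:2d} overhead={overhead(10, r):.0%} "
              f"decodable={frac:.3f}")
```

Running the sweep shows the tension the abstract describes: each extra parity packet costs bandwidth on every frame, including the many frames that suffer no loss, while a long enough burst can still leave a frame undecodable. A fixed parity budget therefore tends to be either wasteful or insufficient when losses are unpredictable.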