CV | MM | Apr 3, 2025

L-LBVC: Long-Term Motion Estimation and Prediction for Learned Bi-Directional Video Compression

arXiv:2504.02560v1 | 4 citations | h-index: 10 | DCC
Originality: Incremental advance
AI Analysis

This work addresses a key bottleneck in learned bi-directional video compression, namely inaccurate motion estimation and prediction across distant frames, and represents a strong incremental advance.

The paper tackles inaccurate long-term motion estimation and prediction, which it identifies as the main reason learned bi-directional video compression lags behind traditional bi-directional coding. It proposes L-LBVC, which pairs an adaptive motion estimation module that handles both short-term and long-term motion with an adaptive motion prediction module that reduces the bit cost of motion coding. The authors report that L-LBVC significantly outperforms previous state-of-the-art learned methods and surpasses VVC (VTM) on some test datasets under the random access configuration.

Recently, learned video compression (LVC) has shown superior performance under low-delay configuration. However, the performance of learned bi-directional video compression (LBVC) still lags behind traditional bi-directional coding. The performance gap mainly arises from inaccurate long-term motion estimation and prediction of distant frames, especially in large motion scenes. To solve these two critical problems, this paper proposes a novel LBVC framework, namely L-LBVC. Firstly, we propose an adaptive motion estimation module that can handle both short-term and long-term motions. Specifically, we directly estimate the optical flows for adjacent frames and non-adjacent frames with small motions. For non-adjacent frames with large motions, we recursively accumulate local flows between adjacent frames to estimate long-term flows. Secondly, we propose an adaptive motion prediction module that can largely reduce the bit cost for motion coding. To improve the accuracy of long-term motion prediction, we adaptively downsample reference frames during testing to match the motion ranges observed during training. Experiments show that our L-LBVC significantly outperforms previous state-of-the-art LVC methods and even surpasses VVC (VTM) on some test datasets under random access configuration.
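The two adaptive ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes dense flows stored as `(H, W, 2)` arrays of `(dx, dy)` displacements, uses bilinear sampling with border clamping to chain adjacent-frame flows, and introduces hypothetical helpers (`choose_flow`, `pick_downsample_factor`) with made-up thresholds. The abstract does not state how the paper decides that motion is "large" or how it chooses the downsampling ratio.

```python
import numpy as np


def bilinear_sample(field, xs, ys):
    """Sample an (H, W, C) field at float coordinates, clamping to the border."""
    h, w = field.shape[:2]
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (xs - x0)[..., None]
    wy = (ys - y0)[..., None]
    top = field[y0, x0] * (1 - wx) + field[y0, x1] * wx
    bottom = field[y1, x0] * (1 - wx) + field[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def compose_flows(flow_ab, flow_bc):
    """Chain two flows: f_ac(p) = f_ab(p) + f_bc(p + f_ab(p))."""
    h, w = flow_ab.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    moved_x = xs + flow_ab[..., 0]
    moved_y = ys + flow_ab[..., 1]
    return flow_ab + bilinear_sample(flow_bc, moved_x, moved_y)


def accumulate_flows(local_flows):
    """Recursively accumulate adjacent-frame flows into one long-term flow."""
    total = local_flows[0]
    for flow in local_flows[1:]:
        total = compose_flows(total, flow)
    return total


def choose_flow(direct_flow, local_flows, max_direct_motion=8.0):
    """Hypothetical switch between direct and accumulated estimation.

    Assumption: mean flow magnitude (in pixels) is used as the proxy for
    "large motion"; the paper's actual criterion is not given in the abstract.
    """
    accumulated = accumulate_flows(local_flows)
    mean_motion = np.linalg.norm(accumulated, axis=-1).mean()
    return direct_flow if mean_motion <= max_direct_motion else accumulated


def pick_downsample_factor(motion_px, train_max_px, factors=(1, 2, 4)):
    """Hypothetical rule: smallest factor that brings the motion into the
    range seen during training, falling back to the largest factor."""
    for factor in factors:
        if motion_px / factor <= train_max_px:
            return factor
    return factors[-1]


# Three adjacent-frame flows, each a uniform 2-pixel shift to the right.
step = np.zeros((4, 6, 2))
step[..., 0] = 2.0
long_term = accumulate_flows([step, step, step])
print(long_term[0, 0])                    # accumulated (dx, dy) at pixel (0, 0)
print(pick_downsample_factor(40.0, 16.0))  # factor chosen for a 40 px motion
```

If a flow is estimated on frames downsampled by a factor `s`, its displacements are in downsampled pixels, so it would need to be upsampled and multiplied by `s` before being applied at full resolution.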

