IV · CV · MM · Jun 15, 2024

Object-Attribute-Relation Representation Based Video Semantic Communication

arXiv:2406.10469v2 · 11 citations
AI Analysis

This work addresses the need for interpretable and adaptable video semantic communication for applications like virtual reality and video streaming, representing an incremental improvement over existing joint source-channel coding methods.

The paper tackles the problem of efficient video transmission in low-bandwidth, high-noise settings by introducing an object-attribute-relation (OAR) semantic framework for videos, resulting in a method that outperforms H.265 coding at lower bit-rates and enhances joint source-channel coding for robust transmission.

With the rapid growth of multimedia data volume, there is an increasing need for efficient video transmission in applications such as virtual reality and future video streaming services. Semantic communication is emerging as a vital technique for ensuring efficient and reliable transmission in low-bandwidth, high-noise settings. However, most current approaches focus on joint source-channel coding (JSCC) that depends on end-to-end training. These methods often lack an interpretable semantic representation and struggle with adaptability to various downstream tasks. In this paper, we introduce the use of object-attribute-relation (OAR) as a semantic framework for videos to facilitate low bit-rate coding and enhance the JSCC process for more effective video transmission. We utilize OAR sequences for both low bit-rate representation and generative video reconstruction. Additionally, we incorporate OAR into the image JSCC model to prioritize communication resources for areas more critical to downstream tasks. Our experiments on traffic surveillance video datasets assess the effectiveness of our approach in terms of video transmission performance. The empirical findings demonstrate that our OAR-based video coding method not only outperforms H.265 coding at lower bit-rates but also synergizes with JSCC to deliver robust and efficient video transmission.
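To make the idea of an OAR sequence as a low bit-rate representation more concrete, here is a minimal sketch of how one frame's objects, attributes, and relations might be held and serialized. It is not the paper's implementation: the class names (`SceneObject`, `Relation`, `OARFrame`), the field layout, the JSON encoding, and the 640x360 frame size are all assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class SceneObject:
    # Hypothetical schema: an object with a class label, a bounding box,
    # and free-form attributes (e.g. color, motion state).
    obj_id: int
    label: str
    bbox: tuple  # (x, y, width, height) in pixels
    attributes: dict = field(default_factory=dict)


@dataclass
class Relation:
    # A (subject, predicate, object) triple linking two objects by id.
    subject_id: int
    predicate: str
    object_id: int


@dataclass
class OARFrame:
    # The OAR description of a single video frame.
    index: int
    objects: list
    relations: list


def encode_frame(frame: OARFrame) -> bytes:
    """Serialize an OAR frame to compact UTF-8 JSON (illustrative codec only)."""
    return json.dumps(asdict(frame), separators=(",", ":")).encode("utf-8")


def decode_frame(payload: bytes) -> OARFrame:
    """Rebuild the OAR frame from its serialized form."""
    raw = json.loads(payload.decode("utf-8"))
    objects = [
        SceneObject(o["obj_id"], o["label"], tuple(o["bbox"]), o["attributes"])
        for o in raw["objects"]
    ]
    relations = [Relation(**r) for r in raw["relations"]]
    return OARFrame(raw["index"], objects, relations)


# Toy traffic-surveillance frame: one car located on one lane.
frame = OARFrame(
    index=0,
    objects=[
        SceneObject(1, "car", (120, 80, 64, 32), {"color": "red"}),
        SceneObject(2, "lane", (0, 60, 640, 90)),
    ],
    relations=[Relation(1, "on", 2)],
)

payload = encode_frame(frame)
raw_rgb_bytes = 640 * 360 * 3  # assumed uncompressed frame size for comparison
print(len(payload), "bytes of OAR vs", raw_rgb_bytes, "bytes of raw RGB")
```

The comparison against raw RGB only illustrates why a symbolic description can be small; it says nothing about how the paper's method compares with H.265. On the receiving side, a generative model would have to synthesize pixels from the decoded description, which this sketch does not attempt.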

