Human-Perception-Oriented Pseudo Analog Video Transmissions with Deep Learning
This work targets the perceptual quality of video multicast. Its contribution is incremental, since it builds on existing pseudo analog methods.
The paper tackles pseudo analog video transmission by proposing ROIC-Cast, a system that enhances region-of-interest (ROI) quality based on human perception. It achieves ROI PSNR gains of over 4.1 dB compared with existing systems at channel SNRs from -5 dB to 10 dB.
Recently, pseudo analog transmission has attracted increasing attention because it alleviates the cliff effect in video multicast scenarios. Existing pseudo analog systems are optimized solely under the minimum mean squared error criterion and do not take perceptual video quality into consideration. In this paper, we propose a human-perception-based pseudo analog video transmission system named ROIC-Cast, which aims to intelligently enhance the transmission quality of the region-of-interest (ROI) parts. First, a classic deep-learning-based saliency detection algorithm is adopted to decompose continuous video sequences into ROI and non-ROI blocks. Second, an effective compression method is used to reduce the amount of side information generated by the ROI extraction module. Then, the power allocation scheme is formulated as a convex problem, and the optimal transmission power for both ROI and non-ROI blocks is derived in closed form. Finally, simulations are conducted to validate the proposed system by comparing it with several existing systems, e.g., KMV-Cast, SoftCast, and DAC-RAN. The proposed ROIC-Cast achieves ROI peak signal-to-noise ratio (PSNR) gains of over 4.1 dB compared with the other systems at channel signal-to-noise ratios of -5 dB, 0 dB, 5 dB, and 10 dB. This significant performance improvement is due to the automatic ROI extraction, high-efficiency data compression, and adaptive power allocation.
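To make the ROI-oriented evaluation concrete, the following is a minimal sketch of two steps mentioned above: labelling blocks as ROI or non-ROI from a saliency map, and measuring PSNR over ROI pixels only. It is not the ROIC-Cast implementation. The function names `classify_blocks` and `roi_psnr`, the mean-saliency rule, the threshold of 0.5, and the 8-bit peak value of 255 are all assumptions made for illustration, and the saliency map is taken as given rather than produced by a deep network.

```python
import math


def classify_blocks(saliency, block_size, threshold=0.5):
    """Label each block as ROI (True) when its mean saliency >= threshold.

    `saliency` is a 2-D list of values in [0, 1]. The thresholding rule is
    a hypothetical stand-in for the paper's ROI extraction step.
    """
    rows, cols = len(saliency), len(saliency[0])
    labels = []
    for top in range(0, rows, block_size):
        label_row = []
        for left in range(0, cols, block_size):
            # Collect the saliency values covered by this block,
            # clipping at the frame boundary.
            values = [
                saliency[r][c]
                for r in range(top, min(top + block_size, rows))
                for c in range(left, min(left + block_size, cols))
            ]
            label_row.append(sum(values) / len(values) >= threshold)
        labels.append(label_row)
    return labels


def roi_psnr(reference, received, roi_mask, peak=255.0):
    """PSNR in dB computed only over pixels where roi_mask is True."""
    squared_error = 0.0
    count = 0
    for ref_row, rec_row, mask_row in zip(reference, received, roi_mask):
        for ref, rec, in_roi in zip(ref_row, rec_row, mask_row):
            if in_roi:
                squared_error += (ref - rec) ** 2
                count += 1
    if count == 0:
        raise ValueError("ROI mask selects no pixels")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical ROI pixels: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

Restricting the mean squared error to ROI pixels is what separates an ROI PSNR figure from the usual whole-frame PSNR. A system can score higher on the former while leaving the latter unchanged or lower, because power is shifted toward the salient blocks.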