Tiyuntsong: A Self-Play Reinforcement Learning Approach for ABR Video Streaming
This work addresses video streaming quality for users, but its contribution is incremental, as it builds on existing RL-based ABR methods.
The paper tackles adaptive bitrate (ABR) video streaming by proposing Tiyuntsong, a self-play reinforcement learning approach with a GAN-based method, which outperforms existing ABR algorithms on the underlying metrics.
Existing reinforcement learning~(RL)-based adaptive bitrate~(ABR) approaches outperform previous methods based on fixed control rules by improving the Quality of Experience~(QoE) score. However, the QoE metric can hardly provide clear guidance for optimization, which ultimately results in unexpected strategies. In this paper, we propose \emph{Tiyuntsong}, a self-play reinforcement learning approach with a generative adversarial network~(GAN)-based method for ABR video streaming. Tiyuntsong learns strategies automatically by training two agents that compete against each other. Notably, the competition results are determined by a set of rules rather than a numerical QoE score, which allows clearer optimization objectives. Meanwhile, we propose a GAN Enhancement Module to extract hidden features from past status, preserving information without the limitation of sequence length. Using testbed experiments, we show that the use of GAN significantly improves Tiyuntsong's performance. By comparing the performance of ABR algorithms, we observe that Tiyuntsong also outperforms existing ABR algorithms in the underlying metrics.
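
To make the idea of a rule-based competition result concrete, the following minimal Python sketch shows how a winner between two agents could be decided without a weighted QoE score. The abstract does not state the actual rule set, so everything here is an assumption made for illustration: the `SessionOutcome` fields, the `compare` function, and the specific ordering of rules (less rebuffering first, then higher mean bitrate) are hypothetical and should not be read as the authors' method.

```python
from dataclasses import dataclass


@dataclass
class SessionOutcome:
    """Summary of one agent's playback on a shared network trace.

    Both fields are hypothetical; the abstract does not name the metrics.
    """
    rebuffer_seconds: float   # total stall time during playback
    mean_bitrate_kbps: float  # average bitrate of downloaded chunks


def compare(a: SessionOutcome, b: SessionOutcome,
            rebuffer_tolerance: float = 0.0) -> int:
    """Return 1 if agent a wins, -1 if agent b wins, 0 for a draw.

    Assumed rule order (illustrative only):
      1. The agent with less rebuffering wins.
      2. If rebuffering is tied within the tolerance, the agent with
         the higher mean bitrate wins.
    """
    diff = a.rebuffer_seconds - b.rebuffer_seconds
    if abs(diff) > rebuffer_tolerance:
        return 1 if diff < 0 else -1
    if a.mean_bitrate_kbps != b.mean_bitrate_kbps:
        return 1 if a.mean_bitrate_kbps > b.mean_bitrate_kbps else -1
    return 0


# Both agents stream the same video over the same trace; the outcome of
# the comparison, not a numeric score, would serve as the training signal.
smooth = SessionOutcome(rebuffer_seconds=0.0, mean_bitrate_kbps=1200.0)
greedy = SessionOutcome(rebuffer_seconds=2.5, mean_bitrate_kbps=3000.0)
result = compare(smooth, greedy)
```

In this sketch the win, loss, or draw signal replaces a hand-tuned linear combination of metrics, so no trade-off weights between bitrate and rebuffering need to be chosen.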