Characterizing Internet Video for Large-scale Active Measurements
This work provides updated data for designing ABR algorithms and DASH players, addressing a gap in understanding modern Internet video, but it is incremental as it builds on prior characterization studies.
The authors tackled the lack of recent studies on Internet video characteristics by collecting datasets of over 130,000 YouTube videos from 2013-2014, finding that video length and file size follow a log-normal distribution and that three minutes of video can represent data rate fluctuations.
The availability of high-definition video content on the web has significantly changed the characteristics of Internet video, but few studies have characterized video since this change. Video characteristics such as video length, format, target bit rate, and resolution provide valuable input for designing Adaptive Bit Rate (ABR) algorithms, sizing playout buffers in Dynamic Adaptive Streaming over HTTP (DASH) players, modeling the variability in video frame sizes, etc. This paper presents datasets collected in 2013 and 2014 that contain over 130,000 videos from YouTube's most viewed (or most popular) video charts in 58 countries. We describe the basic characteristics of the videos on YouTube by category, format, video length, file size, and data rate variation, observing that video length and file size fit a log-normal distribution. We show that three minutes of a video suffice to represent its instantaneous data rate fluctuation and that we can infer data rate characteristics of different video resolutions from a single given one. Based on our findings, we design active measurements for measuring the performance of Internet video.
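As an illustration of the log-normal observation, the following minimal sketch shows one way to fit a log-normal distribution to a list of video lengths by maximum likelihood, that is, by taking the mean and standard deviation of the logarithms. It is not the authors' procedure: the function name `fit_lognormal` is hypothetical, and the sample values are synthetic, generated only to exercise the code, not taken from the datasets described above.

```python
import math
import random
import statistics


def fit_lognormal(samples):
    """Return the maximum-likelihood (mu, sigma) of a log-normal fit.

    mu and sigma are the mean and (population) standard deviation of
    ln(x) over the positive samples. Hypothetical helper for illustration.
    """
    logs = [math.log(x) for x in samples if x > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive samples")
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs, mu)
    return mu, sigma


if __name__ == "__main__":
    # Synthetic video lengths in seconds (assumed parameters, NOT paper data).
    rng = random.Random(0)
    lengths = [rng.lognormvariate(5.0, 1.0) for _ in range(10000)]

    mu, sigma = fit_lognormal(lengths)
    # For a log-normal distribution the median equals exp(mu).
    print(f"mu={mu:.3f} sigma={sigma:.3f} median={math.exp(mu):.1f} s")
```

The same function could be applied to file sizes. Whether a log-normal model is adequate for a given dataset should still be checked separately, for example with a quantile-quantile plot of the log values against a normal distribution.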