Yang Cao

h-index: 33

Papers: 3

Citations: 70

Novelty: 33%

AI Score: 21

Ranked #181,906 of 194,257 authors (top 94%) | #57,044 in CV (top 96%)

3 Papers

6.5 | CV | Mar 26, 2024

Panonut360: A Head and Eye Tracking Dataset for Panoramic Video

Yutong Xu, Junhao Du, Jiahe Wang et al.

With the rapid development and widespread application of VR/AR technology, maximizing the quality of immersive panoramic video services that match users' personal preferences and habits has become a long-standing challenge. Understanding the saliency region where users focus, based on data collected with HMDs, can promote multimedia encoding, transmission, and quality assessment. At the same time, large-scale datasets are essential for researchers and developers to explore short/long-term user behavior patterns and train AI models related to panoramic videos. However, existing panoramic video datasets often include low-frequency user head or eye movement data through short-term videos only, lacking sufficient data for analyzing users' Field of View (FoV) and generating video saliency regions. Driven by these practical factors, in this paper, we present a head and eye tracking dataset involving 50 users (25 males and 25 females) watching 15 panoramic videos. The dataset provides details on the viewport and gaze attention locations of users. In addition, we present some statistical samples extracted from the dataset. For example, the deviation between head and eye movements challenges the widely held assumption that gaze attention decreases from the center of the FoV following a Gaussian distribution. Our analysis reveals a consistent downward offset in gaze fixations relative to the FoV in experimental settings involving multiple users and videos. That is why we name the dataset Panonut, a saliency weighting shaped like a donut. Finally, we also provide a script that generates saliency distributions based on given head or eye coordinates and pre-generated saliency distribution map sets of each video from the collected eye tracking data. The dataset is available at https://dianvrlab.github.io/Panonut360/.

7.3 | SP | Sep 23, 2019

Sensor-Augmented Neural Adaptive Bitrate Video Streaming on UAVs

Xuedou Xiao, Wei Wang, Taobin Chen et al.

Recent advances in unmanned aerial vehicle (UAV) technology have revolutionized a broad class of civil and military applications. However, the designs of wireless technologies that enable real-time streaming of high-definition video between UAVs and ground clients present a conundrum. Most existing adaptive bitrate (ABR) algorithms are not optimized for the air-to-ground links, which usually fluctuate dramatically due to the dynamic flight states of the UAV. In this paper, we present SA-ABR, a new sensor-augmented system that generates ABR video streaming algorithms with the assistance of various kinds of inherent sensor data that are used to pilot UAVs. By incorporating the inherent sensor data with network observations, SA-ABR trains a deep reinforcement learning (DRL) model to extract salient features from the flight state information and automatically learn an ABR algorithm to adapt to the varying UAV channel capacity through the training process. SA-ABR does not rely on any assumptions or models about UAV's flight states or the environment, but instead, it makes decisions by exploiting temporal properties of past throughput through the long short-term memory (LSTM) to adapt itself to a wide range of highly dynamic environments. We have implemented SA-ABR in a commercial UAV and evaluated it in the wild. We compare SA-ABR with a variety of existing state-of-the-art ABR algorithms, and the results show that our system outperforms the best known existing ABR algorithm by 21.4% in terms of the average quality of experience (QoE) reward.

1.2 | MM | May 13, 2018

Video Processing on the Edge for Multimedia IoT Systems

Yang Cao, Zeyu Xu, Peng Qin et al.

In this article, we first survey the current state of video processing on the edge for multimedia Internet-of-Things (M-IoT) systems in three typical scenarios, i.e., smart cities, satellite networks, and Internet-of-Vehicles. By summarizing a general model of edge video processing, we highlight the importance of developing an edge computing platform. Then, we give a method of implementing cooperative video processing on an edge computing platform based on lightweight virtualization technologies. A performance evaluation is conducted, yielding several insightful observations. Moreover, we summarize the challenges and opportunities of realizing effective edge video processing for M-IoT systems.