LG MM · Feb 6

Hybrid Feedback-Guided Optimal Learning for Wireless Interactive Panoramic Scene Delivery

Xiaoyi Wu, Juaren Steiger, Bin Li, R. Srikant

arXiv:2602.07273v1 · 1.4 · h-index: 4

Originality: Incremental advance

AI Analysis

This paper addresses bandwidth constraints in wireless interactive panoramic scene delivery, a problem that matters for VR/AR applications, but it is an incremental improvement over prior multi-armed bandit approaches.

The paper tackles the problem of efficiently delivering panoramic scenes for immersive applications by formulating it as an online learning task with a two-level hybrid feedback model, and demonstrates that the proposed AdaPort algorithm outperforms state-of-the-art baselines in simulations.

Immersive applications such as virtual and augmented reality impose stringent requirements on frame rate, latency, and synchronization between physical and virtual environments. To meet these requirements, an edge server must render panoramic content, predict user head motion, and transmit a portion of the scene that is large enough to cover the user viewport while remaining within wireless bandwidth constraints. Each portion produces two feedback signals: prediction feedback, indicating whether the selected portion covers the actual viewport, and transmission feedback, indicating whether the corresponding packets are successfully delivered. Prior work models this problem as a multi-armed bandit with two-level bandit feedback, but fails to exploit the fact that prediction feedback can be retrospectively computed for all candidate portions once the user head pose is observed. As a result, prediction feedback constitutes full-information feedback rather than bandit feedback. Motivated by this observation, we introduce a two-level hybrid feedback model that combines full-information and bandit feedback, and formulate the portion selection problem as an online learning task under this setting. We derive an instance-dependent regret lower bound for the hybrid feedback model and propose AdaPort, a hybrid learning algorithm that leverages both feedback types to improve learning efficiency. We further establish an instance-dependent regret upper bound that matches the lower bound asymptotically, and demonstrate through real-world trace-driven simulations that AdaPort consistently outperforms state-of-the-art baseline methods.
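
The hybrid feedback structure described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's AdaPort algorithm; it is a simple UCB-style learner written only to show the asymmetry between the two signals. Everything here is an assumption made for illustration: the function name `run_hybrid_learner`, the independent Bernoulli model for coverage and delivery, the reward defined as "covered and delivered", and the particular confidence bonus. The point it shows is that the coverage estimate is updated for every candidate portion in every round (full-information feedback), while the delivery estimate is updated only for the portion actually transmitted (bandit feedback), so exploration is needed only for the transmission term.

```python
import math
import random


def run_hybrid_learner(pred_probs, tx_probs, horizon, seed=0):
    """Illustrative hybrid-feedback learner (NOT the paper's AdaPort).

    pred_probs[i]: assumed probability that portion i covers the viewport.
    tx_probs[i]:   assumed probability that portion i is delivered.
    Returns (total_reward, tx_count), where tx_count[i] is how often
    portion i was transmitted.
    """
    rng = random.Random(seed)
    k = len(pred_probs)
    pred_sum = [0.0] * k  # coverage successes, updated for ALL portions
    tx_sum = [0.0] * k    # delivery successes, updated for the chosen portion only
    tx_count = [0] * k
    total_reward = 0

    for t in range(1, horizon + 1):
        if t <= k:
            # Transmit each portion once to initialize delivery estimates.
            arm = t - 1
        else:
            def index(i):
                # Plain empirical mean: coverage was observed in all t-1 rounds.
                pred_hat = pred_sum[i] / (t - 1)
                # Optimistic estimate: delivery was observed only tx_count[i] times.
                bonus = math.sqrt(2.0 * math.log(t) / tx_count[i])
                tx_ucb = min(1.0, tx_sum[i] / tx_count[i] + bonus)
                return pred_hat * tx_ucb

            arm = max(range(k), key=index)

        # Simulated environment (independent Bernoulli draws, an assumption).
        covered = [rng.random() < p for p in pred_probs]
        delivered = rng.random() < tx_probs[arm]

        # Full-information update: once the head pose is known, coverage
        # can be computed retrospectively for every candidate portion.
        for i in range(k):
            pred_sum[i] += covered[i]

        # Bandit update: delivery is observed only for the transmitted portion.
        tx_sum[arm] += delivered
        tx_count[arm] += 1

        total_reward += int(covered[arm] and delivered)

    return total_reward, tx_count
```

As a usage example, `run_hybrid_learner([0.5, 0.8, 0.95], [0.95, 0.8, 0.4], 2000)` models three portions of increasing size: larger portions cover the viewport more often but are delivered less reliably, and the learner should concentrate on the middle portion, which has the largest product of the two probabilities under these assumed values.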
