NI LG SY | Feb 8, 2024

Attention-Enhanced Prioritized Proximal Policy Optimization for Adaptive Edge Caching

Farnaz Niknia, Ping Wang, Zixu Wang, Aakash Agarwal, Adib S. Rezaei

arXiv:2402.14576v33.34 citations | h-index: 6 | IEEE Trans Veh Technol

Originality: Incremental advance

AI Analysis

This work addresses network traffic issues for edge caching systems, but it is incremental as it builds on existing methods like PPO with an attention enhancement.

This paper tackles excessive data transmission in networks by developing a caching strategy for edge routers. It models the caching problem as a Semi-Markov Decision Process (SMDP) so that caching decisions can be triggered by random file requests, and it accounts for file attributes such as lifetime, size, and priority. In simulations, the strategy outperforms a recent Deep Reinforcement Learning-based technique.

This paper tackles the growing issue of excessive data transmission in networks. With increasing traffic, backhaul links and core networks are under significant load, which has prompted the investigation of caching solutions at edge routers. Many existing studies use Markov Decision Processes (MDP) to tackle caching problems, often assuming decision points at fixed intervals; however, real-world environments are characterized by random request arrivals. Additionally, critical file attributes such as lifetime, size, and priority significantly impact the effectiveness of caching policies, yet existing research fails to integrate all these attributes in policy design. In this work, we model the caching problem using a Semi-Markov Decision Process (SMDP) to better capture the continuous-time nature of real-world applications, enabling caching decisions to be triggered by random file requests. We then introduce a Proximal Policy Optimization (PPO)-based caching strategy that fully considers file attributes such as lifetime, size, and priority. Simulations show that our method outperforms a recent Deep Reinforcement Learning-based technique. To further advance our research, we improve the convergence rate of PPO by prioritizing transitions within the replay buffer through an attention mechanism. This mechanism evaluates the similarity between the current state and all stored transitions, assigning higher priorities to transitions that exhibit greater similarity.
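
The attention-based prioritization described at the end of the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or how priorities become sampling probabilities, so the sketch assumes scaled dot-product similarity followed by a softmax. The names `attention_priorities`, `sample_transitions`, and `temperature`, as well as the dictionary layout of a transition, are hypothetical and are not taken from the paper.

```python
import math
import random


def attention_priorities(current_state, stored_states, temperature=1.0):
    """Return one priority per stored state, summing to 1.

    Assumption: similarity is a scaled dot product between the current
    state (query) and each stored state (key), normalized by a softmax.
    """
    if not stored_states:
        return []
    scale = math.sqrt(len(current_state)) * temperature
    scores = [
        sum(q * k for q, k in zip(current_state, state)) / scale
        for state in stored_states
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(score - top) for score in scores]
    total = sum(weights)
    return [w / total for w in weights]


def sample_transitions(buffer, current_state, batch_size, rng=random):
    """Draw a batch (with replacement) weighted by attention priority."""
    priorities = attention_priorities(
        current_state, [t["state"] for t in buffer]
    )
    return rng.choices(buffer, weights=priorities, k=batch_size)


# Toy buffer with hypothetical two-dimensional states.
buffer = [
    {"state": [1.0, 0.0], "action": 0, "reward": 0.5},
    {"state": [0.0, 1.0], "action": 1, "reward": -0.2},
    {"state": [0.9, 0.1], "action": 0, "reward": 0.7},
]
priorities = attention_priorities([1.0, 0.0], [t["state"] for t in buffer])
batch = sample_transitions(buffer, [1.0, 0.0], batch_size=2)
```

In this toy buffer, the transitions whose states point in nearly the same direction as the current state receive the largest priorities and are therefore sampled more often. A learned projection of states before the dot product, as in standard attention layers, would be a natural extension, but whether the paper uses one is not stated in the abstract.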
