Reinforcement Learning Scheduler for Vehicle-to-Vehicle Communications Outside Coverage
This work addresses the challenge of maintaining efficient V2V communication when network coverage is intermittent, though it appears incremental, since it applies existing RL methods to a specific domain.
The paper tackles the scheduling of radio resources for vehicle-to-vehicle communications in out-of-coverage areas. It uses a reinforcement learning-based centralized scheduler to pre-assign non-interfering resources, and reports performance as good as or better than state-of-the-art distributed schedulers, with learning times ranging from a few hundred to a few thousand epochs.
Radio resources in vehicle-to-vehicle (V2V) communication can be scheduled either by a centralized scheduler residing in the network (e.g., a base station in the case of cellular systems) or by a distributed scheduler, where the vehicles select resources autonomously. The former approach yields considerably higher resource utilization when network coverage is uninterrupted. However, when coverage is intermittent or absent, vehicles receive no input from the centralized scheduler and must revert to distributed scheduling. Motivated by recent advances in reinforcement learning (RL), we investigate whether a centralized learning scheduler can be taught to efficiently pre-assign resources to vehicles for out-of-coverage V2V communication. Specifically, we use the actor-critic RL algorithm to train the centralized scheduler to provide non-interfering resources to vehicles before they enter the out-of-coverage area. Our initial results show that an RL-based scheduler can match the state-of-the-art distributed scheduler and often outperforms it. Furthermore, the learning process completes within a reasonable time (ranging from a few hundred to a few thousand epochs), making the RL-based scheduler a promising solution for V2V communications with intermittent network coverage.
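To make the pre-assignment idea concrete, the following is a minimal sketch of a one-step tabular actor-critic learner on a toy problem. It is not the scheduler described in the paper: the state (vehicle index plus a bitmask of resources already handed out), the reward (+1 for a free resource, -1 for a collision), the problem size, and all names and hyperparameters (`N_VEHICLES`, `N_RESOURCES`, `train`, `greedy_assignment`, the learning rates) are assumptions chosen for illustration only.

```python
import math
import random
from collections import defaultdict

N_VEHICLES = 3   # hypothetical: vehicles about to leave coverage
N_RESOURCES = 4  # hypothetical: size of the resource pool


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=3000, alpha_actor=0.1, alpha_critic=0.1, gamma=1.0, seed=0):
    """One-step actor-critic with a tabular softmax actor and tabular critic."""
    rng = random.Random(seed)
    theta = defaultdict(lambda: [0.0] * N_RESOURCES)  # actor preferences
    value = defaultdict(float)                        # critic estimates
    for _ in range(episodes):
        used = 0  # bitmask of resources already assigned in this episode
        for v in range(N_VEHICLES):
            state = (v, used)
            probs = softmax(theta[state])
            action = rng.choices(range(N_RESOURCES), weights=probs)[0]
            # Toy reward (an assumption): penalize reuse of a resource.
            collided = bool(used & (1 << action))
            reward = -1.0 if collided else 1.0
            used |= 1 << action
            terminal = v == N_VEHICLES - 1
            next_value = 0.0 if terminal else value[(v + 1, used)]
            # Critic: temporal-difference error and value update.
            td_error = reward + gamma * next_value - value[state]
            value[state] += alpha_critic * td_error
            # Actor: policy-gradient step for a softmax policy.
            for a in range(N_RESOURCES):
                grad = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += alpha_actor * td_error * grad
    return theta


def greedy_assignment(theta):
    """Assign one resource per vehicle by following the learned policy greedily."""
    used, assignment = 0, []
    for v in range(N_VEHICLES):
        prefs = theta[(v, used)]
        action = max(range(N_RESOURCES), key=lambda a: prefs[a])
        assignment.append(action)
        used |= 1 << action
    return assignment


if __name__ == "__main__":
    print(greedy_assignment(train()))
```

In this toy setting the critic's temporal-difference error steers the actor away from resources that are already in use, so the greedy assignment should give each vehicle a distinct resource. A realistic scheduler would need a much richer state (vehicle positions, traffic load, time spent outside coverage) and a function approximator in place of the tables.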