AoI-MDP: An AoI Optimized Markov Decision Process (Student Abstract)
For autonomous underwater vehicle navigation, this work addresses observation delay by incorporating Age of Information (AoI) into the Markov decision process (MDP), but the improvement is incremental.
The paper proposes AoI-MDP, which models observation delay as signal delay and integrates Age of Information into the state space and reward functions, outperforming the standard MDP in underwater tasks.
Ocean exploration places high demands on autonomous underwater vehicles, especially when observations are delayed. We propose the age of information optimized Markov decision process (AoI-MDP) to improve performance on underwater tasks. AoI-MDP models observation delay as signal delay and includes it in the state space. It also introduces wait time into the action space and integrates age of information (AoI) into the reward functions, so that reinforcement learning optimizes information freshness together with decision-making. Simulations show that AoI-MDP outperforms the standard MDP and demonstrate its feasibility and generalization in underwater tasks. To accelerate related research, we have made the code available as open source at https://github.com/Xiboxtg/AoI-MDP.
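
The three ingredients named in the abstract (delay in the state, wait time in the action, AoI in the reward) can be illustrated with a small sketch. This is not the authors' implementation: the class `ToyAoIEnv`, the 1-D tracking task, the uniform delay model, the approximation of AoI as delay plus wait time, and the penalty weight `aoi_weight` are all hypothetical choices made here for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass
class AoIStep:
    state: tuple   # (position, observation_delay): delay is part of the state
    reward: float  # task reward minus a weighted AoI penalty
    aoi: float     # age of the observation at decision time


class ToyAoIEnv:
    """Hypothetical 1-D tracking task with an AoI-augmented MDP interface."""

    def __init__(self, aoi_weight=0.1, max_delay=2.0, target=5.0, seed=0):
        self.aoi_weight = aoi_weight  # assumed trade-off weight, not from the paper
        self.max_delay = max_delay    # assumed upper bound on signal delay
        self.target = target
        self.position = 0.0
        self.rng = random.Random(seed)

    def step(self, move, wait):
        """Apply an action made of a motion command and a wait time."""
        if wait < 0:
            raise ValueError("wait time must be non-negative")
        self.position += move
        # Observation delay is modeled as a random signal delay (assumption).
        delay = self.rng.uniform(0.0, self.max_delay)
        # Simplified AoI: the observation ages while in transit and while waiting.
        aoi = delay + wait
        task_reward = -abs(self.target - self.position)
        reward = task_reward - self.aoi_weight * aoi
        return AoIStep(state=(self.position, delay), reward=reward, aoi=aoi)
```

For example, `ToyAoIEnv().step(move=1.0, wait=0.5)` returns a state that carries the sampled delay alongside the position, and a reward that decreases as the observation gets older. A reinforcement learning agent trained on such an interface would choose both the motion command and the wait time.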