Agent Probing Interaction Policies
This work addresses non-stationarity in multi-agent reinforcement learning, but it appears incremental because it extends an existing framework.
The paper tackles the challenge of non-stationarity in multi-agent reinforcement learning by proposing probing policies to identify opponent agent types, extending the Environmental Probing Interaction Policy framework to multi-agent settings.
Reinforcement learning in a multi-agent system is difficult because such systems are inherently non-stationary. In this setting, identifying the type of the other agent is crucial and can help address the non-stationarity. We investigate whether probing policies can help us better identify the type of the other agent in the environment. We make the simplifying assumption that the other agent follows a stationary policy, which our probing policy tries to approximate. Our work extends the Environmental Probing Interaction Policy framework to multi-agent environments.
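To make the idea of probing concrete, here is a minimal Python sketch. It is a toy illustration and not the method of this work: the probe is a hand-written action sequence rather than a learned policy, and the two-action game, the three opponent types, and the names `OPPONENT_TYPES`, `rollout`, and `identify` are all assumptions introduced for this example. Each opponent is stationary in the sense that its response depends only on our previous action.

```python
from typing import Callable, Dict, List, Sequence

C, D = 0, 1  # two actions in a toy repeated game: cooperate, defect

# Hypothetical stationary opponent types (assumed for illustration).
# Each maps our previous action to the opponent's next action.
OPPONENT_TYPES: Dict[str, Callable[[int], int]] = {
    "always_cooperate": lambda prev: C,
    "always_defect": lambda prev: D,
    "tit_for_tat": lambda prev: prev,
}


def rollout(probe: Sequence[int], opponent: Callable[[int], int]) -> List[int]:
    """Play the probe actions in order and record the opponent's responses."""
    responses = []
    prev = C  # assumption: the interaction starts as if we had cooperated
    for action in probe:
        responses.append(opponent(prev))
        prev = action
    return responses


def identify(probe: Sequence[int], observed: Sequence[int]) -> List[str]:
    """Return every candidate type whose responses to the probe match the observation."""
    return [
        name
        for name, opponent in OPPONENT_TYPES.items()
        if rollout(probe, opponent) == list(observed)
    ]


if __name__ == "__main__":
    true_opponent = OPPONENT_TYPES["tit_for_tat"]
    for probe in ([C, C, C], [D, C, C]):
        observed = rollout(probe, true_opponent)
        print(probe, "->", identify(probe, observed))
```

In this toy setting, a probe that only cooperates cannot separate a tit-for-tat opponent from one that always cooperates, while a probe that defects once can. That is the intuition behind choosing probing behavior deliberately instead of observing the other agent passively.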