Online Learning of Temporal Dependencies for Sustainable Foraging Problem
This work addresses social dilemmas in multi-agent systems and is aimed at researchers in AI and sustainability. It is incremental: it builds on existing methods and does not achieve a broad breakthrough.
The study tackled the sustainable foraging problem, a dynamic multi-agent environment in which agents must balance individual rewards against collective long-term sustainability. It tested online learning methods such as Neuro-Evolution and Deep Recurrent Q-Networks with Long Short-Term Memory, and found that Long Short-Term Memory helped single agents develop sustainable strategies but failed in multi-agent scenarios.
The sustainable foraging problem is a dynamic environment testbed for exploring forms of agent cognition in dealing with social dilemmas in a multi-agent setting. Agents need to resist the temptation of individual rewards from foraging and instead choose the collective long-term goal of sustainability. We investigate online learning methods in Neuro-Evolution and Deep Recurrent Q-Networks that enable agents to attempt the problem one-shot, as is often required by wicked social problems. We further explore whether learning temporal dependencies with Long Short-Term Memory can help agents develop sustainable foraging strategies in the long term. We found that integrating Long Short-Term Memory helped a single agent develop sustainable strategies, but failed to help agents manage the social dilemma that arises in the multi-agent scenario.
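The tension between individual reward and collective sustainability can be illustrated with a toy shared-resource model. This is only a sketch under stated assumptions, not the environment used in this work: the function name `simulate`, the logistic regrowth rule, and every parameter value are hypothetical and chosen purely for illustration.

```python
def simulate(num_agents, harvest_per_agent, steps=50,
             stock=10.0, capacity=20.0, regrowth=0.3):
    """Toy shared-resource model (hypothetical, not the paper's environment).

    Each step, every agent takes up to `harvest_per_agent` from a common
    stock, then the remaining stock regrows logistically.
    Returns (total_harvest, final_stock).
    """
    total = 0.0
    for _ in range(steps):
        for _ in range(num_agents):
            take = min(harvest_per_agent, stock)  # cannot take more than is left
            stock -= take
            total += take
        # Assumed logistic regrowth: an exhausted stock never recovers.
        stock += regrowth * stock * (1.0 - stock / capacity)
    return total, stock


# Greedy foraging: large immediate reward, but the resource collapses.
print(simulate(num_agents=4, harvest_per_agent=3.0))  # (10.0, 0.0)
# Restrained foraging: small per-step reward, but the resource persists.
print(simulate(num_agents=4, harvest_per_agent=0.3))
```

In this sketch the greedy agents earn more in the first step but exhaust the stock immediately, while the restrained agents collect a larger total over the full horizon. Recognising that trade-off requires reasoning over many time steps, which is the kind of temporal dependency the abstract refers to.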