Enhancing Reasoning with Collaboration and Memory
This work addresses reasoning enhancement in AI systems, but its contribution is incremental: it builds on existing methods such as chain-of-thought prompting and self-consistency.
The paper tackles the problem of improving reasoning performance in LLMs by studying collaborative multi-agent systems with memory. It finds that random exemplar selection often outperforms more principled approaches and that exemplars can sometimes distract models.
We envision a continuous collaborative learning system where groups of LLM agents work together to solve reasoning problems, drawing on memory they collectively build to improve performance as they gain experience. This work establishes the foundations for such a system by studying the interoperability of chain-of-thought reasoning styles, multi-agent collaboration, and memory banks. Extending beyond the identical agents of self-consistency, we introduce varied-context agents with diverse exemplars and a summarizer agent in place of voting. We generate frozen and continuously learned memory banks of exemplars and pair them with fixed, random, and similarity-based retrieval mechanisms. Our systematic study reveals where various methods contribute to the reasoning performance of two LLMs on three grounded reasoning tasks, showing that random exemplar selection can often beat more principled approaches and that, in some tasks, including any exemplars serves only to distract both weak and strong models.
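To make the three retrieval mechanisms concrete, the following is a minimal sketch of a memory bank that supports fixed, random, and similarity-based exemplar selection, plus a frozen flag that blocks further additions. It is not the paper's implementation: the names `Exemplar`, `MemoryBank`, and `retrieve` are hypothetical, and bag-of-words cosine similarity is an assumed stand-in, since this summary does not state which similarity measure the authors use.

```python
import math
import random
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Exemplar:
    """A stored worked example (hypothetical structure, not from the paper)."""
    question: str
    reasoning: str


def _bow(text: str) -> Counter:
    # Lowercased whitespace tokens; an assumed stand-in for a real embedding.
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class MemoryBank:
    exemplars: List[Exemplar] = field(default_factory=list)
    frozen: bool = False  # a frozen bank rejects new exemplars

    def add(self, exemplar: Exemplar) -> bool:
        """Store an exemplar unless the bank is frozen; report whether it was stored."""
        if self.frozen:
            return False
        self.exemplars.append(exemplar)
        return True

    def retrieve(
        self,
        query: str,
        k: int,
        mode: str = "fixed",
        rng: Optional[random.Random] = None,
    ) -> List[Exemplar]:
        """Return up to k exemplars using the chosen selection mode."""
        k = min(k, len(self.exemplars))
        if mode == "fixed":
            # Same exemplars for every query: the first k stored.
            return self.exemplars[:k]
        if mode == "random":
            # Uniform sample without replacement, ignoring the query.
            return (rng or random).sample(self.exemplars, k)
        if mode == "similarity":
            # Rank stored questions by similarity to the query.
            q = _bow(query)
            ranked = sorted(
                self.exemplars,
                key=lambda e: _cosine(q, _bow(e.question)),
                reverse=True,
            )
            return ranked[:k]
        raise ValueError(f"unknown retrieval mode: {mode}")
```

In this sketch, a continuously learned bank would call `add` after each solved problem, while a frozen bank is populated once and then has `frozen` set to `True`. Varied-context agents could each call `retrieve` in `"random"` mode with a different seeded `random.Random`, so that every agent sees a different set of exemplars.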