Bootstrapping Cognitive Agents with a Large Language Model
This work addresses the challenge of building more efficient and interpretable AI agents for specific domains such as kitchen tasks, although its contribution appears incremental because it mainly integrates existing methods.
The paper tackles the problem of combining the noisy general knowledge of large language models with the interpretability and flexibility of cognitive architectures in embodied agents that perform kitchen tasks. The resulting agent is more efficient than agents based solely on large language models.
Large language models contain noisy general knowledge of the world, yet are hard to train or fine-tune. Cognitive architectures, on the other hand, have excellent interpretability and are flexible to update, but require a lot of manual work to instantiate. In this work, we combine the best of both worlds by bootstrapping a cognitive-based model with the noisy knowledge encoded in large language models. Through an embodied agent doing kitchen tasks, we show that our proposed framework yields better efficiency compared to an agent based entirely on large language models. Our experiments indicate that large language models are a good source of information for cognitive architectures, and that the cognitive architecture can in turn verify the knowledge of large language models and adapt it to a specific domain.
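The bootstrapping idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the names make_noisy_llm, verify, GROUND_TRUTH, and BootstrappedAgent are hypothetical, the "language model" is a stub that deliberately returns a wrong plan on its first answer, and the kitchen environment is reduced to a lookup table. The sketch only shows the general pattern of querying a noisy source, verifying the proposal, and caching verified knowledge so that later requests need no further queries.

```python
from typing import Callable, Dict, List


def make_noisy_llm() -> Callable[[str], List[str]]:
    """Hypothetical stand-in for a large language model.

    The first answer for a task is deliberately wrong to mimic noisy
    knowledge; later answers for the same task are correct.
    """
    calls: Dict[str, int] = {}
    candidate_plans = {
        "make_tea": [["pour_water"], ["boil_water", "pour_water"]],
    }

    def query(task: str) -> List[str]:
        attempt = calls.get(task, 0)
        calls[task] = attempt + 1
        options = candidate_plans[task]
        # Clamp the index so repeated queries keep returning the last plan.
        return options[min(attempt, len(options) - 1)]

    return query


# Toy environment (an assumption): a plan succeeds only if it matches this table.
GROUND_TRUTH = {"make_tea": ["boil_water", "pour_water"]}


def verify(task: str, plan: List[str]) -> bool:
    """Check a proposed plan against the toy environment."""
    return GROUND_TRUTH[task] == plan


class BootstrappedAgent:
    """Agent that stores verified plans as reusable rules."""

    def __init__(self, llm: Callable[[str], List[str]]) -> None:
        self.llm = llm
        self.rules: Dict[str, List[str]] = {}  # task -> verified plan
        self.llm_queries = 0

    def solve(self, task: str, max_attempts: int = 3) -> List[str]:
        # Reuse verified knowledge first; no model query is needed.
        if task in self.rules:
            return self.rules[task]
        for _ in range(max_attempts):
            self.llm_queries += 1
            plan = self.llm(task)
            if verify(task, plan):
                self.rules[task] = plan  # keep only verified knowledge
                return plan
        raise RuntimeError(f"no verified plan found for {task!r}")
```

Calling solve("make_tea") twice on the same agent queries the stub only during the first call; the second call is answered from the stored rule. This is the kind of efficiency gain the abstract attributes to pairing a language model with a verifying architecture.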