CV | CL | Dec 1, 2018

Learning to Caption Images through a Lifetime by Asking Questions

arXiv:1812.00235v3 | 33 citations
Originality: Incremental advance
AI Analysis

This work addresses lifelong learning for AI agents in real-world settings. It is rated incremental because it builds on existing active learning methods.

The paper tackles the problem of enabling artificial agents to continuously expand their knowledge beyond closed datasets. Its agent learns to ask humans natural language questions, and on the MSCOCO dataset it achieves better performance with less human supervision than the baselines.

In order to bring artificial agents into our lives, we will need to go beyond supervised learning on closed datasets to having the ability to continuously expand knowledge. Inspired by a student learning in a classroom, we present an agent that can continuously learn by posing natural language questions to humans. Our agent is composed of three interacting modules, one that performs captioning, another that generates questions and a decision maker that learns when to ask questions by implicitly reasoning about the uncertainty of the agent and expertise of the teacher. As compared to current active learning methods which query images for full captions, our agent is able to ask pointed questions to improve the generated captions. The agent trains on the improved captions, expanding its knowledge. We show that our approach achieves better performance using less human supervision than the baselines on the challenging MSCOCO dataset.
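The interaction described in the abstract (caption, decide whether to ask, fold the answer back into the caption) can be illustrated with a toy loop. This is only a sketch under stated assumptions, not the paper's implementation: the paper's decision maker is learned, whereas here a fixed confidence threshold stands in for it, and the names `WordGuess`, `should_ask`, `refine_caption`, and `teacher` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class WordGuess:
    """One word of a draft caption plus the captioner's confidence in it."""
    word: str
    confidence: float  # assumed to lie in [0, 1]


def should_ask(guess: WordGuess, threshold: float, budget_left: int) -> bool:
    # Stand-in for the learned decision maker: ask only when the captioner
    # is unsure and the question budget is not yet spent.
    return budget_left > 0 and guess.confidence < threshold


def refine_caption(
    draft: List[WordGuess],
    teacher: Callable[[int, str], str],
    threshold: float = 0.5,
    budget: int = 1,
) -> Tuple[List[str], int]:
    """Return the refined caption and the number of questions asked."""
    refined: List[str] = []
    asked = 0
    for position, guess in enumerate(draft):
        if should_ask(guess, threshold, budget - asked):
            # A real agent would generate a natural language question here;
            # the toy teacher simply returns the word it prefers.
            refined.append(teacher(position, guess.word))
            asked += 1
        else:
            refined.append(guess.word)
    return refined, asked


# Hypothetical draft caption with made-up confidences.
draft = [
    WordGuess("a", 0.9),
    WordGuess("cat", 0.3),
    WordGuess("on", 0.8),
    WordGuess("a", 0.9),
    WordGuess("sofa", 0.4),
]
corrections = {1: "dog"}  # the toy teacher only knows about position 1


def teacher(position: int, word: str) -> str:
    return corrections.get(position, word)


refined, asked = refine_caption(draft, teacher, threshold=0.5, budget=1)
print(" ".join(refined), "| questions asked:", asked)
```

With a budget of one question, the loop spends it on the first low-confidence word and leaves the later one untouched. In the approach the abstract describes, the refined caption would then serve as a training target for the captioner.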

Code Implementations: 1 repo