Looking for Confirmations: An Effective and Human-Like Visual Dialogue Strategy
This paper aims to make interaction in visual dialogue systems more human-like, though the contribution is incremental because it builds on existing beam search methods.
The paper tackles the problem that systems for Visual Dialogue tasks generate unnatural and ineffective goal-oriented questions. It proposes Confirm-it, a model that uses beam search re-ranking to ask confirmatory questions, which results in more natural and effective dialogues in the GuessWhat?! game.
Generating goal-oriented questions in Visual Dialogue tasks is a challenging and long-standing problem. State-of-the-art systems are shown to generate questions that, although grammatically correct, often lack an effective strategy and sound unnatural to humans. Inspired by the cognitive literature on information search and cross-situational word learning, we design Confirm-it, a model based on a beam search re-ranking algorithm that guides an effective goal-oriented strategy by asking questions that confirm the model's conjecture about the referent. We take the GuessWhat?! game as a case study. We show that dialogues generated by Confirm-it are more natural and effective than beam search decoding without re-ranking.
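
The re-ranking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the belief dictionary, and the toy update rule are all assumptions made for illustration. The sketch assumes the model holds a probability distribution over candidate objects, takes the most probable one as its conjecture, and selects from the beam the question whose simulated answer most raises the probability of that conjecture.

```python
from typing import Callable, Dict, Sequence

Beliefs = Dict[str, float]  # candidate object -> probability (hypothetical representation)


def rerank_beam(
    candidates: Sequence[str],
    beliefs: Beliefs,
    simulate_update: Callable[[str, Beliefs], Beliefs],
) -> str:
    """Return the beam candidate that most increases belief in the current conjecture."""
    if not candidates:
        raise ValueError("beam is empty")
    # The conjecture is the object the model currently considers most likely.
    conjecture = max(beliefs, key=beliefs.get)
    # Score each question by the conjecture's probability after a simulated answer.
    return max(candidates, key=lambda q: simulate_update(q, beliefs)[conjecture])


# Toy stand-in for the answer simulation and belief update (an assumption, not from the paper).
# Each question maps to the set of objects for which the answer would be "yes".
YES_SETS = {
    "is it an animal?": {"dog", "cat"},
    "is it a dog?": {"dog"},
    "is it a vehicle?": {"car"},
}


def toy_update(question: str, beliefs: Beliefs) -> Beliefs:
    """Answer the question as if the conjecture were the target, then renormalize."""
    conjecture = max(beliefs, key=beliefs.get)
    yes = YES_SETS[question]
    # Keep only objects consistent with the simulated answer.
    keep = yes if conjecture in yes else set(beliefs) - yes
    total = sum(p for obj, p in beliefs.items() if obj in keep)
    return {obj: (p / total if obj in keep else 0.0) for obj, p in beliefs.items()}


beliefs = {"dog": 0.5, "cat": 0.3, "car": 0.2}
best = rerank_beam(list(YES_SETS), beliefs, toy_update)
print(best)
```

In this toy setting the conjecture is "dog", and the question "is it a dog?" is selected because it is the only candidate that raises the conjecture's probability to 1.0. In a real system the candidates would come from a beam search decoder and the update from learned answering and guessing components, which this sketch replaces with a lookup table.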