AI · Sep 17, 2017

Improving Search through A3C Reinforcement Learning based Conversational Agent

arXiv:1709.05638v2 · 4 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of bootstrapping conversational agents for subjective search domains where data is scarce. The contribution appears incremental, since it builds on existing reinforcement learning methods.

The paper tackles the problem of training conversational search agents for subjective tasks such as image search without labeled data. It proposes a stochastic virtual user to simulate interactions and reports that the A3C-based agent achieves higher rewards and better states than Q-learning.

We develop a reinforcement learning based search assistant that helps users realize their intent through a set of actions and a sequence of interactions. Our approach caters to subjective search, where the user is seeking digital assets such as images, which is fundamentally different from tasks that have objective and limited search modalities. Labeled conversational data is generally not available for such search tasks, and training the agent through human interactions can be time-consuming. We propose a stochastic virtual user that impersonates a real user and can be used to sample user behavior efficiently, which accelerates the bootstrapping of the agent. We develop an A3C-based, context-preserving architecture that enables the agent to provide contextual assistance to the user. We compare the A3C agent with Q-learning and evaluate its performance on the average rewards and state values it obtains with the virtual user in validation episodes. Our experiments show that the agent learns to achieve higher rewards and better states.
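
To make the idea of a stochastic virtual user concrete, here is a minimal Python sketch. It is not the paper's implementation: the action names, response probabilities, and reward values are all hypothetical placeholders chosen for illustration. The sketch samples a user response from a fixed distribution conditioned on the agent's action, rolls out one episode, and computes the discounted returns that an actor-critic method such as A3C would compare against its value estimates.

```python
import random

# Hypothetical agent actions, user responses, and probabilities.
# The paper's actual action set and user model are not reproduced here.
USER_MODEL = {
    "ask_refinement": {"refine_query": 0.6, "ignore": 0.3, "leave": 0.1},
    "show_results": {"select_asset": 0.5, "refine_query": 0.4, "leave": 0.1},
}
# Hypothetical per-response rewards.
REWARDS = {"refine_query": 0.2, "select_asset": 1.0, "ignore": -0.1, "leave": -1.0}


class VirtualUser:
    """Samples a user response given the agent's last action."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def respond(self, agent_action):
        dist = USER_MODEL[agent_action]
        responses = list(dist)
        weights = [dist[r] for r in responses]
        return self.rng.choices(responses, weights=weights, k=1)[0]


def run_episode(user, policy, max_turns=10):
    """Roll out one conversation and return the list of per-turn rewards."""
    rewards = []
    for turn in range(max_turns):
        action = policy(turn)
        response = user.respond(action)
        rewards.append(REWARDS[response])
        if response in ("leave", "select_asset"):
            break  # the episode ends when the user leaves or finds an asset
    return rewards


def discounted_returns(rewards, gamma=0.9, bootstrap=0.0):
    """Compute R_t = r_t + gamma * R_{t+1}, seeded with a bootstrap value."""
    returns = []
    running = bootstrap
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    # A trivial fixed policy: alternate between the two hypothetical actions.
    policy = lambda turn: "ask_refinement" if turn % 2 == 0 else "show_results"
    episode = run_episode(VirtualUser(seed=42), policy)
    print(episode, discounted_returns(episode))
```

Because the virtual user is only a sampler, many such episodes can be generated cheaply, which is the property the abstract relies on for bootstrapping the agent without human interaction data.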

