Model-Based Simulation for Optimising Smart Reply
This work addresses the problem of improving user experience in communication tools by increasing reply diversity, though the contribution is incremental, as it builds on existing Smart Reply frameworks.
The paper tackles the challenge of generating diverse and relevant reply sets for Smart Reply systems. It introduces SimSR, a method that uses model-based simulation to directly optimise for at least one relevant reply, achieving up to 21% improvement in ROUGE score and 18% in Self-ROUGE score on two public datasets.
Smart Reply (SR) systems present a user with a set of replies, one of which can be selected in place of typing out a response. To perform well at this task, a system should present the user with a diverse set of options, to maximise the chance that at least one of them conveys the user's desired response. This is a significant challenge, due to the lack of datasets containing sets of responses to learn from. As a result, previous work has focused largely on post-hoc diversification, rather than explicitly learning to predict sets of responses. Motivated by this problem, we present a novel method, SimSR, that employs model-based simulation to discover high-value response sets by simulating possible user responses with a learned world model. Unlike previous approaches, this allows our method to directly optimise the end goal of SR: maximising the relevance of at least one of the predicted replies. Empirically, on two public datasets, when compared to state-of-the-art (SoTA) baselines, our method achieves up to 21% and 18% improvement in ROUGE score and Self-ROUGE score respectively.
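The set-level objective described above can be illustrated with a minimal sketch. It is not the SimSR implementation. It assumes the simulated user responses are given as a fixed list (in the paper they come from a learned world model), uses a toy unigram-overlap F1 as a stand-in for ROUGE, and searches all size-k subsets exhaustively, which is only feasible for small candidate pools. The names `token_f1`, `expected_best_relevance`, and `select_reply_set` are hypothetical.

```python
from itertools import combinations


def token_f1(a: str, b: str) -> float:
    """Toy relevance score: unigram-overlap F1 (an assumed stand-in for ROUGE)."""
    ta, tb = a.lower().split(), b.lower().split()
    if not ta or not tb:
        return 0.0
    remaining = list(tb)
    common = 0
    for tok in ta:
        if tok in remaining:
            remaining.remove(tok)  # count each token of b at most once
            common += 1
    if common == 0:
        return 0.0
    precision = common / len(ta)
    recall = common / len(tb)
    return 2 * precision * recall / (precision + recall)


def expected_best_relevance(reply_set, simulated_responses):
    """Average, over simulated user responses, of the best score any reply in the set achieves."""
    total = sum(
        max(token_f1(reply, response) for reply in reply_set)
        for response in simulated_responses
    )
    return total / len(simulated_responses)


def select_reply_set(candidates, simulated_responses, k=3):
    """Pick the size-k subset of candidates with the highest expected best relevance.

    Exhaustive search is an assumption made for clarity, not the paper's search procedure.
    """
    return max(
        combinations(candidates, k),
        key=lambda reply_set: expected_best_relevance(reply_set, simulated_responses),
    )


# Hypothetical data: candidate replies and simulated user responses.
candidates = ["yes sounds good", "yes that sounds good", "no sorry", "maybe later"]
simulated = ["yes sounds good", "no sorry", "maybe later"]
chosen = select_reply_set(candidates, simulated, k=3)
```

Because the score rewards a set only for its best-matching reply per simulated response, two near-duplicate replies add little value, so the selection favours sets that cover different intents.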