CL · AI · Oct 6, 2020

Converting the Point of View of Messages Spoken to Virtual Assistants

arXiv:2010.02600v2993 citations
AI Analysis

This paper addresses a specific usability issue for virtual assistant users by improving the accuracy of relayed messages, though the contribution is incremental because it builds on existing NLP methods.

The paper tackles the problem of virtual assistants incorrectly relaying messages by developing a system that converts the point of view of spoken messages. T5 achieved a BLEU score of 63.8 and a METEOR score of 83.0, while CopyNet achieved a relative perplexity of 1.59 with 37 times fewer parameters than T5.

Virtual assistants can be quite literal at times. If the user says "tell Bob I love him," most virtual assistants will extract the message "I love him" and send it to the user's contact named Bob, rather than properly converting the message to "I love you." We designed a system to allow virtual assistants to take a voice message from one user, convert the point of view of the message, and then deliver the result to its target user. We developed a rule-based model, which integrates a linear text classification model, part-of-speech tagging, and constituency parsing with rule-based transformation methods. We also investigated Neural Machine Translation (NMT) approaches, including LSTMs, CopyNet, and T5. We explored 5 metrics to gauge both naturalness and faithfulness automatically, and we chose to use BLEU plus METEOR for faithfulness and relative perplexity using a separately trained language model (GPT) for naturalness. Transformer-CopyNet and T5 performed similarly on faithfulness metrics, with T5 achieving a slight edge: a BLEU score of 63.8 and a METEOR score of 83.0. CopyNet was the most natural, with a relative perplexity of 1.59. CopyNet also has 37 times fewer parameters than T5. We have publicly released our dataset, which is composed of 46,565 crowd-sourced samples.
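
To make the "tell Bob I love him" example concrete, here is a minimal Python sketch of a point-of-view conversion done with a plain pronoun lookup. It is not the paper's rule-based model, which relies on a text classifier, part-of-speech tagging, and constituency parsing; the function name `convert_pov`, the `THIRD_TO_SECOND` table, and the command pattern are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical lookup from third-person pronouns (assumed to refer to the
# recipient) to second person. Ambiguous forms such as "her" are resolved
# naively; a real system would need part-of-speech information.
THIRD_TO_SECOND = {
    "he": "you", "she": "you",
    "him": "you", "her": "you",
    "his": "your",
    "himself": "yourself", "herself": "yourself",
}

# Assumed command shape: "tell <contact> [that] <message>".
COMMAND = re.compile(
    r"^(?:tell|ask)\s+(?P<contact>\w+)\s+(?:that\s+)?(?P<message>.+)$",
    re.IGNORECASE,
)


def convert_pov(utterance):
    """Return (contact, converted_message), or None if the pattern fails."""
    match = COMMAND.match(utterance.strip())
    if match is None:
        return None
    words = match.group("message").split()
    # Swap each pronoun found in the table; leave every other word as-is.
    converted = [THIRD_TO_SECOND.get(word.lower(), word) for word in words]
    return match.group("contact"), " ".join(converted)


print(convert_pov("tell Bob I love him"))
```

The sketch ignores verb agreement ("he is" would become "you is"), punctuation attached to pronouns, and pronouns that refer to someone other than the recipient, which is the kind of gap that motivates the parsing-based rules and the NMT models described above.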

Code Implementations: 2 repos