CLNov 23, 2021
Variational Learning for Unsupervised Knowledge Grounded DialogsMayank Mishra, Dhiraj Madan, Gaurav Pandey et al.
Recent methods for knowledge grounded dialogs generate responses by incorporating information from an external textual document. These methods do not require the exact document to be known during training and rely on the use of a retrieval system to fetch relevant documents from a large index. The documents used to generate the responses are modeled as latent variables whose prior probabilities need to be estimated. Models such as RAG and REALM, marginalize the document probabilities over the documents retrieved from the index to define the log likelihood loss function which is optimized end-to-end. In this paper, we develop a variational approach to the above technique wherein, we instead maximize the Evidence Lower bound (ELBO). Using a collection of three publicly available open-conversation datasets, we demonstrate how the posterior distribution, that has information from the ground-truth response, allows for a better approximation of the objective function during training. To overcome the challenges associated with sampling over a large knowledge collection, we develop an efficient approach to approximate the ELBO. To the best of our knowledge we are the first to apply variational training for open-scale unsupervised knowledge grounded dialog systems.
CLSep 9, 2019
Neural Conversational QA: Learning to Reason v.s. Exploiting PatternsNikhil Verma, Abhishek Sharma, Dhiraj Madan et al.
Neural Conversational QA tasks like ShARC require systems to answer questions based on the contents of a given passage. On studying recent state-of-the-art models on the ShARCQA task, we found indications that the models learn spurious clues/patterns in the dataset. Furthermore, we show that a heuristic-based program designed to exploit these patterns can have performance comparable to that of the neural models. In this paper we share our findings about four types of patterns found in the ShARC corpus and describe how neural models exploit them. Motivated by the aforementioned findings, we create and share a modified dataset that has fewer spurious patterns, consequently allowing models to learn better.
AINov 2, 2018
Unsupervised Learning of Interpretable Dialog ModelsDhiraj Madan, Dinesh Raghu, Gaurav Pandey et al.
Recently several deep learning based models have been proposed for end-to-end learning of dialogs. While these models can be trained from data without the need for any additional annotations, it is hard to interpret them. On the other hand, there exist traditional state based dialog systems, where the states of the dialog are discrete and hence easy to interpret. However these states need to be handcrafted and annotated in the data. To achieve the best of both worlds, we propose Latent State Tracking Network (LSTN) using which we learn an interpretable model in unsupervised manner. The model defines a discrete latent variable at each turn of the conversation which can take a finite set of values. Since these discrete variables are not present in the training data, we use EM algorithm to train our model in unsupervised manner. In the experiments, we show that LSTN can help achieve interpretability in dialog models without much decrease in performance compared to end-to-end approaches.
CLOct 29, 2017
Finding Dominant User Utterances And System Responses in ConversationsDhiraj Madan, Sachindra Joshi
There are several dialog frameworks which allow manual specification of intents and rule based dialog flow. The rule based framework provides good control to dialog designers at the expense of being more time consuming and laborious. The job of a dialog designer can be reduced if we could identify pairs of user intents and corresponding responses automatically from prior conversations between users and agents. In this paper we propose an approach to find these frequent user utterances (which serve as examples for intents) and corresponding agent responses. We propose a novel SimCluster algorithm that extends standard K-means algorithm to simultaneously cluster user utterances and agent utterances by taking their adjacency information into account. The method also aligns these clusters to provide pairs of intents and response groups. We compare our results with those produced by using simple Kmeans clustering on a real dataset and observe upto 10% absolute improvement in F1-scores. Through our experiments on synthetic dataset, we show that our algorithm gains more advantage over K-means algorithm when the data has large variance.