Revisiting the DARPA Communicator Data using Conversation Analysis
The paper addresses how to improve conversational systems for users by pinpointing specific failures, though the contribution is incremental in that it applies an existing qualitative method to known data.
The paper identifies failures in human-computer conversation systems by treating swear words as indicators of user frustration. Applying conversation analysis to DARPA Communicator transcripts, it finds at least one class of failure caused by the systems' inability to handle mixed initiative at the discourse structure level.
The state of the art in human-computer conversation leaves something to be desired and, indeed, talking to a computer can be downright annoying. This paper describes an approach to identifying ``opportunities for improvement'' in these systems by looking for abuse in the form of swear words. The premise is that humans swear at computers as a sanction and, as such, swear words mark a point of failure where the system did not behave as it should. Having identified where things went wrong, we can work backward through the transcripts and, using conversation analysis (CA), work out how things went wrong. Conversation analysis is a qualitative methodology and can appear quite alien, indeed unscientific, to those of us from a quantitative background. The paper starts with a description of conversation analysis in its modern form, and then goes on to apply the methodology to transcripts of frustrated and annoyed users in the DARPA Communicator project. The conclusion is that there is at least one species of failure caused by the inability of the Communicator systems to handle mixed initiative at the discourse structure level. Along the way, I hope to demonstrate that there is an alternative future for computational linguistics that does not rely on larger and larger text corpora.
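The first, mechanical step of the approach (locating swear words and pulling out the turns that precede them) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicon, the `(speaker, utterance)` turn format, the function names, and the sample exchange are all hypothetical and are not taken from the Communicator corpus or from the paper. The analysis of how things went wrong remains a qualitative, manual reading of the extracted turns.

```python
import re
from typing import List, Set, Tuple

# Hypothetical placeholder lexicon; a real study would supply its own list.
SWEAR_WORDS: Set[str] = {"damn", "hell"}

# Assumed transcript format: one (speaker, utterance) pair per turn.
Turn = Tuple[str, str]


def find_sanctions(turns: List[Turn], lexicon: Set[str] = SWEAR_WORDS) -> List[int]:
    """Return the indices of user turns that contain a word from the lexicon."""
    hits = []
    for i, (speaker, text) in enumerate(turns):
        # Match whole tokens so that "hello" does not count as "hell".
        tokens = re.findall(r"[a-z']+", text.lower())
        if speaker == "user" and any(tok in lexicon for tok in tokens):
            hits.append(i)
    return hits


def context_before(turns: List[Turn], index: int, window: int = 4) -> List[Turn]:
    """Return up to `window` turns leading to the sanction, plus the sanction itself."""
    start = max(0, index - window)
    return turns[start:index + 1]


# Invented exchange for illustration only; not real Communicator data.
sample: List[Turn] = [
    ("system", "What city are you leaving from?"),
    ("user", "I want to change the date"),
    ("system", "What city are you leaving from?"),
    ("user", "damn it, Boston"),
]

for i in find_sanctions(sample):
    for speaker, text in context_before(sample, i, window=2):
        print(f"{speaker}: {text}")
```

The window size is arbitrary here; in practice the analyst would read back as far as needed to find where the trouble began.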