Conversation as Action Under Uncertainty
This work addresses the challenge of handling uncertainty in spoken dialog systems. It is incremental in that it builds on existing inference and decision-making approaches.
The authors address uncertainty in continuous spoken dialog by proposing Quartet, a task-independent, multimodal architecture that introduces four interdependent levels of analysis for robust conversation support. They illustrate it with prototype systems for PowerPoint navigation and receptionist tasks.
Conversations abound with uncertainties of various kinds. Treating conversation as inference and decision making under uncertainty, we propose Quartet, a task-independent, multimodal architecture for supporting robust continuous spoken dialog. We introduce four interdependent levels of analysis, and describe representations, inference procedures, and decision strategies for managing uncertainties within and between the levels. We highlight the approach by reviewing interactions between a user and two spoken dialog systems developed using the Quartet architecture: Presenter, a prototype system for navigating Microsoft PowerPoint presentations, and the Bayesian Receptionist, a prototype system for dealing with tasks typically handled by front desk receptionists at the Microsoft corporate campus.
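To make "inference and decision making under uncertainty" concrete, the following is a minimal sketch of the general idea, not the Quartet implementation. It assumes a single level of analysis, a two-intent slide-navigation scenario, and made-up probabilities and utilities; the names `posterior`, `best_action`, and all intent, observation, and action labels are hypothetical.

```python
# Hypothetical illustration: every name, probability, and utility below is an
# assumption for the sake of the example, not a value from the Quartet systems.

def posterior(prior, likelihood, observation):
    """Bayes' rule: update a belief over user intents given one observation."""
    unnormalized = {
        intent: prior[intent] * likelihood[intent][observation]
        for intent in prior
    }
    total = sum(unnormalized.values())
    return {intent: p / total for intent, p in unnormalized.items()}


def best_action(belief, utility):
    """Return the action with the highest expected utility under the belief."""
    expected = {
        action: sum(belief[intent] * u for intent, u in by_intent.items())
        for action, by_intent in utility.items()
    }
    return max(expected, key=expected.get), expected


# Prior belief over what the user wants.
prior = {"next_slide": 0.5, "previous_slide": 0.5}

# P(observation | intent): a noisy recognizer that sometimes confuses commands.
likelihood = {
    "next_slide": {"heard_next": 0.7, "heard_previous": 0.3},
    "previous_slide": {"heard_next": 0.4, "heard_previous": 0.6},
}

# Utility of each system action given the user's true intent.
utility = {
    "go_next": {"next_slide": 1.0, "previous_slide": -1.0},
    "go_previous": {"next_slide": -1.0, "previous_slide": 1.0},
    "ask_to_repeat": {"next_slide": 0.3, "previous_slide": 0.3},
}

belief = posterior(prior, likelihood, "heard_next")
action, expected = best_action(belief, utility)
print(belief)
print(action, expected)
```

With these assumed numbers, the belief in `next_slide` after hearing "next" is about 0.64, so advancing has an expected utility of about 0.27 while asking the user to repeat scores 0.30. The sketch therefore selects `ask_to_repeat`, showing how a decision-theoretic treatment can favor clarification when recognition is unreliable.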