CL · MM · May 17, 2023

Dual Semantic Knowledge Composed Multimodal Dialog Systems

arXiv:2305.09990v1 · 13 citations
Originality: Incremental advance
AI Analysis

This work improves multimodal dialog systems for applications like customer service or virtual assistants, but it is incremental as it builds on existing methods by adding relation knowledge and regularization.

The paper tackles textual response generation in multimodal task-oriented dialog systems. It addresses two limitations of existing approaches, namely ignoring relation knowledge and lacking representation-level regularization, and proposes a new system (MDS-S2) that shows superiority in experiments on a public dataset.

Textual response generation is an essential task for multimodal task-oriented dialog systems. Although existing studies have achieved fruitful progress, they still suffer from two critical limitations: 1) focusing on the attribute knowledge but ignoring the relation knowledge that can reveal the correlations between different entities and hence promote the response generation, and 2) only conducting the cross-entropy-loss-based output-level supervision but lacking the representation-level regularization. To address these limitations, we devise a novel multimodal task-oriented dialog system (named MDS-S2). Specifically, MDS-S2 first simultaneously acquires the context-related attribute and relation knowledge from the knowledge base, whereby the non-intuitive relation knowledge is extracted by the n-hop graph walk. Thereafter, considering that the attribute knowledge and relation knowledge can benefit responding to different levels of questions, we design a multi-level knowledge composition module in MDS-S2 to obtain the latent composed response representation. Moreover, we devise a set of latent query variables to distill the semantic information from the composed response representation and the ground truth response representation, respectively, and thus conduct the representation-level semantic regularization. Extensive experiments on a public dataset have verified the superiority of our proposed MDS-S2. We have released the code and parameters to facilitate the research community.
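
The abstract mentions extracting relation knowledge with an n-hop graph walk but does not give the details. As a rough illustration only, the sketch below performs a breadth-first walk over a toy knowledge base of (head, relation, tail) triples and collects every triple touched within n hops of the entities mentioned in the dialog context. The function name `n_hop_relations`, the toy entities and relations, and the choice to treat edges as undirected are all assumptions made for this example, not the procedure used in MDS-S2.

```python
from collections import defaultdict, deque


def n_hop_relations(triples, seeds, n):
    """Return the triples reachable within n hops of the seed entities.

    Hypothetical helper for illustration; not the authors' implementation.
    Edges are walked in both directions (an assumption of this sketch).
    """
    # Map each entity to the triples it takes part in, plus the entity
    # on the other end of that triple.
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        triple = (head, relation, tail)
        adjacency[head].append((triple, tail))
        adjacency[tail].append((triple, head))

    visited = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    seen_triples = set()
    found = []

    while frontier:
        entity, depth = frontier.popleft()
        if depth == n:
            continue  # do not expand beyond n hops
        for triple, neighbor in adjacency[entity]:
            if triple not in seen_triples:
                seen_triples.add(triple)
                found.append(triple)
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return found


# Toy knowledge base with made-up entities and relations.
toy_kb = [
    ("shirt_A", "same_brand_as", "shirt_B"),
    ("shirt_B", "matches_with", "trousers_C"),
    ("trousers_C", "sold_in", "store_D"),
]

one_hop = n_hop_relations(toy_kb, ["shirt_A"], 1)
two_hop = n_hop_relations(toy_kb, ["shirt_A"], 2)
```

In this toy example, one hop from `shirt_A` yields only the `same_brand_as` triple, while two hops also surface the `matches_with` triple, a link that is not directly attached to the entity in the context. That is the kind of non-intuitive relation knowledge the abstract refers to.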

