STICKERCONV: Generating Multimodal Empathetic Responses from Scratch
The paper addresses empathetic communication in online dialogue systems, though its contribution appears incremental because it builds on existing empathetic dialogue research.
The paper tackles the lack of comprehensive datasets for multimodal empathetic dialogue by introducing STICKERCONV, a dataset with 12.9K dialogue sessions and 5.8K unique stickers, and proposes PEGS, a framework that generates contextually relevant and emotionally resonant responses.
Stickers, while widely recognized for enhancing empathetic communication in online interactions, remain underexplored in current empathetic dialogue research, largely because comprehensive datasets are lacking. In this paper, we introduce the Agent for STICKERCONV (Agent4SC), which uses collaborative agent interactions to realistically simulate human sticker usage, thereby enhancing multimodal empathetic communication. Building on this foundation, we develop a multimodal empathetic dialogue dataset, STICKERCONV, comprising 12.9K dialogue sessions, 5.8K unique stickers, and 2K diverse conversational scenarios. This dataset serves as a benchmark for multimodal empathetic generation. We further propose PErceive and Generate Stickers (PEGS), a multimodal empathetic response generation framework, complemented by a comprehensive set of LLM-based empathy evaluation metrics. Our experiments demonstrate the effectiveness of PEGS in generating contextually relevant and emotionally resonant multimodal empathetic responses, contributing to the advancement of more nuanced and engaging empathetic dialogue systems.