Large Language Models Produce Responses Perceived to be Empathic
This work addresses the potential for LLMs to enhance empathy in human peer support. It is incremental in that it builds on existing LLM capabilities without introducing new methods.
The study found that large language models (LLMs) generated responses perceived as more empathic than human-written ones in peer support contexts, with human raters consistently favoring LLM outputs across two studies involving 192 and 202 participants.
Large Language Models (LLMs) have demonstrated surprising performance on many tasks, including writing supportive messages that display empathy. Here, we had these models generate empathic messages in response to posts describing common life experiences, such as workplace situations, parenting, relationships, and other anxiety- and anger-eliciting situations. Across two studies (N=192, 202), we showed human raters a variety of responses written by several models (GPT4 Turbo, Llama2, and Mistral), and had people rate these responses on how empathic they seemed to be. We found that LLM-generated responses were consistently rated as more empathic than human-written responses. Linguistic analyses also showed that these models write in distinct, predictable ``styles'', in terms of their use of punctuation, emojis, and certain words. These results highlight the potential of using LLMs to enhance human peer support in contexts where empathy is important.
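
To make the kind of stylistic comparison described above concrete, the following is a minimal sketch of how per-response punctuation, emoji, and word-usage features might be computed. It is not the analysis pipeline used in the studies. The function name `style_features`, the chosen features, and the rough emoji character ranges are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a rough emoji matcher covering only common emoji code-point
# blocks (pictographs, misc symbols, dingbats); it is not exhaustive.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def style_features(text: str) -> dict:
    """Return simple, length-normalized style features for one response.

    Hypothetical helper for illustration only.
    """
    # Lowercased word tokens; apostrophes are kept so "I'm" stays one token.
    words = re.findall(r"[a-z']+", text.lower())
    n_words = max(len(words), 1)  # avoid division by zero on empty text
    return {
        "exclamations_per_word": text.count("!") / n_words,
        "questions_per_word": text.count("?") / n_words,
        "emojis_per_word": len(EMOJI_RE.findall(text)) / n_words,
        "top_words": Counter(words).most_common(3),
    }


if __name__ == "__main__":
    # Made-up example response, not taken from the study data.
    example = "I'm so sorry! That sounds hard! \U0001F499"
    print(style_features(example))
```

Averaging such features over each writer's responses (each model, and the human authors) would give one simple way to compare the ``styles'' mentioned above, under the stated assumptions.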