Toshish Jawale

h-index24
2papers

2 Papers

CLAug 20, 2024
Emotional RAG LLMs: Reading Comprehension for the Open Internet

Benjamin Reichman, Adar Avsian, Kartik Talamadupula et al. · ibm-research

Queries to large language models (LLMs) can be divided into two parts: the instruction/question and the accompanying context. The context for retrieval-augmented generation (RAG) systems in most benchmarks comes from Wikipedia-like texts written in a neutral and factual tone. However, real-world RAG applications often retrieve internet-based text with diverse tones and linguistic styles, posing challenges for downstream tasks. This paper introduces (a) a dataset that transforms RAG-retrieved passages into emotionally inflected and sarcastic text, (b) an emotion translation model for adapting text to different tones, and (c) a prompt-based method to improve LLMs' pragmatic interpretation of retrieved text.

CLMar 8, 2024
Are Human Conversations Special? A Large Language Model Perspective

Toshish Jawale, Chaitanya Animesh, Sekhar Vallath et al. · ibm-research

This study analyzes changes in the attention mechanisms of large language models (LLMs) when used to understand natural conversations between humans (human-human). We analyze three use cases of LLMs: interactions over web content, code, and mathematical texts. By analyzing attention distance, dispersion, and interdependency across these domains, we highlight the unique challenges posed by conversational data. Notably, conversations require nuanced handling of long-term contextual relationships and exhibit higher complexity through their attention patterns. Our findings reveal that while language models exhibit domain-specific attention behaviors, there is a significant gap in their ability to specialize in human conversations. Through detailed attention entropy analysis and t-SNE visualizations, we demonstrate the need for models trained with a diverse array of high-quality conversational data to enhance understanding and generation of human-like dialogue. This research highlights the importance of domain specialization in language models and suggests pathways for future advancement in modeling human conversational nuances.