Madhusudan Basak

CL
h-index7
9papers
1,042citations
Novelty32%
AI Score31

9 Papers

CLAug 17, 2023
Characterizing Information Seeking Events in Health-Related Social Discourse

Omar Sharif, Madhusudan Basak, Tanzia Parvin et al.

Social media sites have become a popular platform for individuals to seek and share health information. Despite the progress in natural language processing for social media mining, a gap remains in analyzing health-related texts on social discourse in the context of events. Event-driven analysis can offer insights into different facets of healthcare at an individual and collective level, including treatment options, misconceptions, knowledge gaps, etc. This paper presents a paradigm to characterize health-related information-seeking in social discourse through the lens of events. Events here are board categories defined with domain experts that capture the trajectory of the treatment/medication. To illustrate the value of this approach, we analyze Reddit posts regarding medications for Opioid Use Disorder (OUD), a critical global health concern. To the best of our knowledge, this is the first attempt to define event categories for characterizing information-seeking in OUD social discourse. Guided by domain experts, we develop TREAT-ISE, a novel multilabel treatment information-seeking event dataset to analyze online discourse on an event-based framework. This dataset contains Reddit posts on information-seeking events related to recovery from OUD, where each post is annotated based on the type of events. We also establish a strong performance benchmark (77.4% F1 score) for the task by employing several machine learning and deep learning classifiers. Finally, we thoroughly investigate the performance and errors of ChatGPT on this task, providing valuable insights into the LLM's capabilities and ongoing characterization efforts.

CLMar 16, 2023
The Scope of In-Context Learning for the Extraction of Medical Temporal Constraints

Parker Seegmiller, Joseph Gatto, Madhusudan Basak et al.

Medications often impose temporal constraints on everyday patient activity. Violations of such medical temporal constraints (MTCs) lead to a lack of treatment adherence, in addition to poor health outcomes and increased healthcare expenses. These MTCs are found in drug usage guidelines (DUGs) in both patient education materials and clinical texts. Computationally representing MTCs in DUGs will advance patient-centric healthcare applications by helping to define safe patient activity patterns. We define a novel taxonomy of MTCs found in DUGs and develop a novel context-free grammar (CFG) based model to computationally represent MTCs from unstructured DUGs. Additionally, we release three new datasets with a combined total of N = 836 DUGs labeled with normalized MTCs. We develop an in-context learning (ICL) solution for automatically extracting and normalizing MTCs found in DUGs, achieving an average F1 score of 0.62 across all datasets. Finally, we rigorously investigate ICL model performance against a baseline model, across datasets and MTC types, and through in-depth error analysis.

CLJan 27, 2023
Theme-driven Keyphrase Extraction to Analyze Social Media Discourse

William Romano, Omar Sharif, Madhusudan Basak et al.

Social media platforms are vital resources for sharing self-reported health experiences, offering rich data on various health topics. Despite advancements in Natural Language Processing (NLP) enabling large-scale social media data analysis, a gap remains in applying keyphrase extraction to health-related content. Keyphrase extraction is used to identify salient concepts in social media discourse without being constrained by predefined entity classes. This paper introduces a theme-driven keyphrase extraction framework tailored for social media, a pioneering approach designed to capture clinically relevant keyphrases from user-generated health texts. Themes are defined as broad categories determined by the objectives of the extraction task. We formulate this novel task of theme-driven keyphrase extraction and demonstrate its potential for efficiently mining social media text for the use case of treatment for opioid use disorder. This paper leverages qualitative and quantitative analysis to demonstrate the feasibility of extracting actionable insights from social media data and efficiently extracting keyphrases using minimally supervised NLP models. Our contributions include the development of a novel data collection and curation framework for theme-driven keyphrase extraction and the creation of MOUD-Keyphrase, the first dataset of its kind comprising human-annotated keyphrases from a Reddit community. We also identify the scope of minimally supervised NLP models to extract keyphrases from social media data efficiently. Lastly, we found that a large language model (ChatGPT) outperforms unsupervised keyphrase extraction models, and we evaluate its efficacy in this task.

CLSep 22, 2022
Scope of Pre-trained Language Models for Detecting Conflicting Health Information

Joseph Gatto, Madhusudan Basak, Sarah M. Preum

An increasing number of people now rely on online platforms to meet their health information needs. Thus identifying inconsistent or conflicting textual health information has become a safety-critical task. Health advice data poses a unique challenge where information that is accurate in the context of one diagnosis can be conflicting in the context of another. For example, people suffering from diabetes and hypertension often receive conflicting health advice on diet. This motivates the need for technologies which can provide contextualized, user-specific health advice. A crucial step towards contextualized advice is the ability to compare health advice statements and detect if and how they are conflicting. This is the task of health conflict detection (HCD). Given two pieces of health advice, the goal of HCD is to detect and categorize the type of conflict. It is a challenging task, as (i) automatically identifying and categorizing conflicts requires a deeper understanding of the semantics of the text, and (ii) the amount of available data is quite limited. In this study, we are the first to explore HCD in the context of pre-trained language models. We find that DeBERTa-v3 performs best with a mean F1 score of 0.68 across all experiments. We additionally investigate the challenges posed by different conflict types and how synthetic data improves a model's understanding of conflict-specific semantics. Finally, we highlight the difficulty in collecting real health conflicts and propose a human-in-the-loop synthetic data augmentation approach to expand existing HCD datasets. Our HCD training dataset is over 2x bigger than the existing HCD dataset and is made publicly available on Github.

CLSep 20, 2020Code
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation

Tahmid Hasan, Abhik Bhattacharjee, Kazi Samin et al.

Despite being the seventh most widely spoken language in the world, Bengali has received much less attention in machine translation literature due to being low in resources. Most publicly available parallel corpora for Bengali are not large enough; and have rather poor quality, mostly because of incorrect sentence alignments resulting from erroneous sentence segmentation, and also because of a high volume of noise present in them. In this work, we build a customized sentence segmenter for Bengali and propose two novel methods for parallel corpus creation on low-resource setups: aligner ensembling and batch filtering. With the segmenter and the two methods combined, we compile a high-quality Bengali-English parallel corpus comprising of 2.75 million sentence pairs, more than 2 million of which were not available before. Training on neural models, we achieve an improvement of more than 9 BLEU score over previous approaches to Bengali-English machine translation. We also evaluate on a new test set of 1000 pairs made with extensive quality control. We release the segmenter, parallel corpus, and the evaluation set, thus elevating Bengali from its low-resource status. To the best of our knowledge, this is the first ever large scale study on Bengali-English machine translation. We believe our study will pave the way for future research on Bengali-English machine translation as well as other low-resource languages. Our data and code are available at https://github.com/csebuetnlp/banglanmt.

CLMar 5, 2024
Scope of Large Language Models for Mining Emerging Opinions in Online Health Discourse

Joseph Gatto, Madhusudan Basak, Yash Srivastava et al.

In this paper, we develop an LLM-powered framework for the curation and evaluation of emerging opinion mining in online health communities. We formulate emerging opinion mining as a pairwise stance detection problem between (title, comment) pairs sourced from Reddit, where post titles contain emerging health-related claims on a topic that is not predefined. The claims are either explicitly or implicitly expressed by the user. We detail (i) a method of claim identification -- the task of identifying if a post title contains a claim and (ii) an opinion mining-driven evaluation framework for stance detection using LLMs. We facilitate our exploration by releasing a novel test dataset, Long COVID-Stance, or LC-stance, which can be used to evaluate LLMs on the tasks of claim identification and stance detection in online health communities. Long Covid is an emerging post-COVID disorder with uncertain and complex treatment guidelines, thus making it a suitable use case for our task. LC-Stance contains long COVID treatment related discourse sourced from a Reddit community. Our evaluation shows that GPT-4 significantly outperforms prior works on zero-shot stance detection. We then perform thorough LLM model diagnostics, identifying the role of claim type (i.e. implicit vs explicit claims) and comment length as sources of model error.

CLApr 1, 2024
Do LLMs Find Human Answers To Fact-Driven Questions Perplexing? A Case Study on Reddit

Parker Seegmiller, Joseph Gatto, Omar Sharif et al.

Large language models (LLMs) have been shown to be proficient in correctly answering questions in the context of online discourse. However, the study of using LLMs to model human-like answers to fact-driven social media questions is still under-explored. In this work, we investigate how LLMs model the wide variety of human answers to fact-driven questions posed on several topic-specific Reddit communities, or subreddits. We collect and release a dataset of 409 fact-driven questions and 7,534 diverse, human-rated answers from 15 r/Ask{Topic} communities across 3 categories: profession, social identity, and geographic location. We find that LLMs are considerably better at modeling highly-rated human answers to such questions, as opposed to poorly-rated human answers. We present several directions for future research based on our initial findings.

HCMar 27, 2025
Socially Constructed Treatment Plans: Analyzing Online Peer Interactions to Understand How Patients Navigate Complex Medical Conditions

Madhusudan Basak, Omar Sharif, Jessica Hulsey et al.

When faced with complex and uncertain medical conditions (e.g., cancer, mental health conditions, recovery from substance dependency), millions of patients seek online peer support. In this study, we leverage content analysis of online discourse and ethnographic studies with clinicians and patient representatives to characterize how treatment plans for complex conditions are "socially constructed." Specifically, we ground online conversation on medication-assisted recovery treatment to medication guidelines and subsequently surface when and why people deviate from the clinical guidelines. We characterize the implications and effectiveness of socially constructed treatment plans through in-depth interviews with clinical experts. Finally, given the enthusiasm around AI-powered solutions for patient communication, we investigate whether and how socially constructed treatment-related knowledge is reflected in a state-of-the-art large language model (LLM). Leveraging a novel mixed-method approach, this study highlights critical research directions for patient-centered communication in online health communities.

CLFeb 24, 2025
REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction

Omar Sharif, Joseph Gatto, Madhusudan Basak et al.

Event argument extraction identifies arguments for predefined event roles in text. Existing work evaluates this task with exact match (EM), where predicted arguments must align exactly with annotated spans. While suitable for span-based models, this approach falls short for large language models (LLMs), which often generate diverse yet semantically accurate arguments. EM severely underestimates performance by disregarding valid variations. Furthermore, EM evaluation fails to capture implicit arguments (unstated but inferable) and scattered arguments (distributed across a document). These limitations underscore the need for an evaluation framework that better captures models' actual performance. To bridge this gap, we introduce REGen, a Reliable Evaluation framework for Generative event argument extraction. REGen combines the strengths of exact, relaxed, and LLM-based matching to better align with human judgment. Experiments on six datasets show that REGen reveals an average performance gain of +23.93 F1 over EM, reflecting capabilities overlooked by prior evaluation. Human validation further confirms REGen's effectiveness, achieving 87.67% alignment with human assessments of argument correctness.