Seungjun Yi

CL
h-index56
5papers
25citations
Novelty45%
AI Score42

5 Papers

CLJun 30, 2025
Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning

Seungjun Yi, Joakim Nguyen, Huimin Xu et al.

Congenital heart disease (CHD) presents complex, lifelong challenges often underrepresented in traditional clinical metrics. While unstructured narratives offer rich insights into patient and caregiver experiences, manual thematic analysis (TA) remains labor-intensive and unscalable. We propose a fully automated large language model (LLM) pipeline that performs end-to-end TA on clinical narratives, which eliminates the need for manual coding or full transcript review. Our system employs a novel multi-agent framework, where specialized LLM agents assume roles to enhance theme quality and alignment with human analysis. To further improve thematic relevance, we optionally integrate reinforcement learning from human feedback (RLHF). This supports scalable, patient-centered analysis of large qualitative datasets and allows LLMs to be fine-tuned for specific clinical contexts.

HCMar 26, 2025
TAMA: A Human-AI Collaborative Thematic Analysis Framework Using Multi-Agent LLMs for Clinical Interviews

Huimin Xu, Seungjun Yi, Terence Lim et al.

Thematic analysis (TA) is a widely used qualitative approach for uncovering latent meanings in unstructured text data. TA provides valuable insights in healthcare but is resource-intensive. Large Language Models (LLMs) have been introduced to perform TA, yet their applications in healthcare remain unexplored. Here, we propose TAMA: A Human-AI Collaborative Thematic Analysis framework using Multi-Agent LLMs for clinical interviews. We leverage the scalability and coherence of multi-agent systems through structured conversations between agents and coordinate the expertise of cardiac experts in TA. Using interview transcripts from parents of children with Anomalous Aortic Origin of a Coronary Artery (AAOCA), a rare congenital heart disease, we demonstrate that TAMA outperforms existing LLM-assisted TA approaches, achieving higher thematic hit rate, coverage, and distinctiveness. TAMA demonstrates strong potential for automated TA in clinical settings by leveraging multi-agent LLM systems with human-in-the-loop integration by enhancing quality while significantly reducing manual workload.

CLSep 18, 2025
Position: Thematic Analysis of Unstructured Clinical Transcripts with Large Language Models

Seungjun Yi, Joakim Nguyen, Terence Lim et al.

This position paper examines how large language models (LLMs) can support thematic analysis of unstructured clinical transcripts, a widely used but resource-intensive method for uncovering patterns in patient and provider narratives. We conducted a systematic review of recent studies applying LLMs to thematic analysis, complemented by an interview with a practicing clinician. Our findings reveal that current approaches remain fragmented across multiple dimensions including types of thematic analysis, datasets, prompting strategies and models used, most notably in evaluation. Existing evaluation methods vary widely (from qualitative expert review to automatic similarity metrics), hindering progress and preventing meaningful benchmarking across studies. We argue that establishing standardized evaluation practices is critical for advancing the field. To this end, we propose an evaluation framework centered on three dimensions: validity, reliability, and interpretability.

GTSep 28, 2025
Beyond Game Theory Optimal: Profit-Maximizing Poker Agents for No-Limit Holdem

SeungHyun Yi, Seungjun Yi

Game theory has grown into a major field over the past few decades, and poker has long served as one of its key case studies. Game-Theory-Optimal (GTO) provides strategies to avoid loss in poker, but pure GTO does not guarantee maximum profit. To this end, we aim to develop a model that outperforms GTO strategies to maximize profit in No Limit Holdem, in heads-up (two-player) and multi-way (more than two-player) situations. Our model finds the GTO foundation and goes further to exploit opponents. The model first navigates toward many simulated poker hands against itself and keeps adjusting its decisions until no action can reliably beat it, creating a strong baseline that is close to the theoretical best strategy. Then, it adapts by observing opponent behavior and adjusting its strategy to capture extra value accordingly. Our results indicate that Monte-Carlo Counterfactual Regret Minimization (CFR) performs best in heads-up situations and CFR remains the strongest method in most multi-way situations. By combining the defensive strength of GTO with real-time exploitation, our approach aims to show how poker agents can move from merely not losing to consistently winning against diverse opponents.

CLSep 21, 2025
SFT-TA: Supervised Fine-Tuned Agents in Multi-Agent LLMs for Automated Inductive Thematic Analysis

Seungjun Yi, Joakim Nguyen, Huimin Xu et al.

Thematic Analysis (TA) is a widely used qualitative method that provides a structured yet flexible framework for identifying and reporting patterns in clinical interview transcripts. However, manual thematic analysis is time-consuming and limits scalability. Recent advances in LLMs offer a pathway to automate thematic analysis, but alignment with human results remains limited. To address these limitations, we propose SFT-TA, an automated thematic analysis framework that embeds supervised fine-tuned (SFT) agents within a multi-agent system. Our framework outperforms existing frameworks and the gpt-4o baseline in alignment with human reference themes. We observed that SFT agents alone may underperform, but achieve better results than the baseline when embedded within a multi-agent system. Our results highlight that embedding SFT agents in specific roles within a multi-agent system is a promising pathway to improve alignment with desired outputs for thematic analysis.