LLM-Augmented Computational Phenotyping of Long COVID
This work addresses the need for personalized interventions in Long COVID by providing a disease-agnostic framework for discovering clinically interpretable subphenotypes. The contribution is incremental: it extends existing phenotyping methods with LLM integration.
The study tackled the problem of poorly understood clinical subphenotypes in Long COVID by proposing an LLM-augmented computational phenotyping framework, which identified three distinct phenotypes (Protected, Responder, Refractory) from 13,511 participants with strong statistical separation in symptom severity and disease burden.
Phenotypic characterization is essential for understanding heterogeneity in chronic diseases and for guiding personalized interventions. Long COVID is a complex and persistent condition, yet its clinical subphenotypes remain poorly understood. In this work, we propose an LLM-augmented computational phenotyping framework, ``Grace Cycle'', that iteratively integrates hypothesis generation, evidence extraction, and feature refinement to discover clinically meaningful subgroups from longitudinal patient data. Applied to 13,511 Long COVID participants, the framework identifies three distinct clinical phenotypes: Protected, Responder, and Refractory. These phenotypes exhibit pronounced separation in peak symptom severity, baseline disease burden, and longitudinal dose-response patterns, with strong statistical support across multiple independent dimensions. This study illustrates how large language models can be integrated into a principled, statistically grounded pipeline for phenotypic screening from complex longitudinal data. The proposed framework is disease-agnostic and offers a general approach for discovering clinically interpretable subphenotypes.