Ariel Yuhan Ong

96.3 CY Mar 22 Code

Deliberative multi-agent large language models improve clinical reasoning in ophthalmology

Ehsan Misaghi, Sean T Berkowitz, Bing Yu Chen et al.

Large language models (LLMs) show potential for ophthalmic clinical reasoning, yet individual models risk introducing harm. We evaluated whether multi-agent LLM deliberative councils improve diagnostic performance and mitigate harm compared with individual LLMs. In a comparative cross-sectional study, we assessed 12 individual LLMs and three multi-agent councils on 100 ophthalmology clinical vignettes. Each council comprised four models assembled by type: proprietary flagship, proprietary fast, and open-source. Models independently answered a vignette and anonymously ranked one another's responses; a designated chair then synthesized all responses and peer reviews into a final answer. Councils consistently outperformed pooled individual models across all three tiers. Accuracy improved for proprietary flagship (95.0% vs 90.8%; risk difference [RD]: 4.25 [95% CI: 0.45, 8.05]), proprietary fast (96.0% vs 86.5%; RD: 9.50 [5.31, 13.59]), and open-source councils (91.0% vs 83.2%; RD: 7.75 [4.17, 11.33]). Harm rates declined for proprietary flagship (10.0% vs 22.5%; RD: -12.50 [-16.86, -8.14]), proprietary fast (16.0% vs 31.8%; RD: -15.75 [-21.49, -10.01]), and open-source councils (22.0% vs 38.5%; RD: -16.50 [-22.27, -10.73]). Coverage analysis revealed net positive gains for accuracy (ΔCoverage: 4.4-9.8 percentage points) and safety (ΔCoverage: 13.6-20.6), indicating that councils recovered correct diagnoses and averted harm. Councils elevated correct diagnoses to higher rank positions and produced more complete differentials and management plans (all P<.05). Harmful council responses showed reduced combined commission-and-omission errors and tended to be less severe. Structured deliberation via multi-agent LLM councils may enhance the reliability of LLM-assisted ophthalmic clinical reasoning.
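The three-stage council procedure described in the abstract (independent answers, anonymous peer ranking, chair synthesis) can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the `Model` callable interface, the `run_council` function, the prompt wording, and the choice to hide each member's own response from its review are all assumptions made for the example.

```python
from typing import Callable, Dict

# Hypothetical interface: a "model" is any function mapping a prompt to a reply.
Model = Callable[[str], str]


def run_council(vignette: str, members: Dict[str, Model], chair: Model) -> str:
    # Stage 1: every member answers the vignette independently.
    answers = {name: model(vignette) for name, model in members.items()}

    # Replace model names with neutral labels so that reviews are anonymous.
    labels = {
        name: f"Response {chr(ord('A') + i)}" for i, name in enumerate(answers)
    }

    # Stage 2: each member ranks the other members' anonymized responses
    # (excluding a member's own response is an assumption of this sketch).
    reviews = []
    for name, model in members.items():
        others = "\n".join(
            f"{labels[other]}: {text}"
            for other, text in answers.items()
            if other != name
        )
        reviews.append(
            model(f"Vignette: {vignette}\nRank these responses:\n{others}")
        )

    # Stage 3: the chair sees all responses and all reviews, then synthesizes.
    all_responses = "\n".join(f"{labels[n]}: {t}" for n, t in answers.items())
    all_reviews = "\n".join(f"Review {i + 1}: {r}" for i, r in enumerate(reviews))
    return chair(
        f"Vignette: {vignette}\nResponses:\n{all_responses}\n"
        f"Peer reviews:\n{all_reviews}\nWrite the final answer."
    )
```

In practice each `Model` would wrap a call to an actual LLM; here any function from prompt string to reply string can stand in, which keeps the sketch testable without network access.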

49.8 HC Apr 24

Vibe coding for clinicians: democratising bespoke software development for digital health innovation

Ariel Yuhan Ong, Iain Livingstone, Caroline Kilduff et al.

Clinicians often face workflow problems that are perceived as either too bespoke or too low stakes to attract commercial attention. Historically, most clinicians have lacked the technical knowledge to address these problems, but the recent emergence of "vibe coding" presents a transformative opportunity. Vibe coding refers to the co-development of software using natural language prompts to large language models. It offers a pathway to create simple tools that address these real-world pain points, or to prototype more complex ideas. In this review, written by a group of early adopter clinicians with a range of programming expertise, we introduce vibe coding for clinicians (especially those with no or minimal coding experience) as a way of democratising innovation from the front lines. We discuss foundational skills, outline some common challenges, provide a practical step-by-step playbook, and illustrate this approach with some case examples, taking care to consider caveats and guardrails for deployment. We propose that vibe coding is more than a technical shortcut for beginners, but is not a replacement for professional software developers. Instead, it can bridge the gap between clinical insight and technical execution, equipping clinicians with the ability to rapidly prototype digital health solutions that best reflect clinical realities.
