Dustin A. Carlson

h-index46

3papers

43citations

Novelty57%

AI Score35

Ranked #102,696 of 194,257 authors (top 53%)#22,616 in LG (top 56%)

3 Papers

3.6CVOct 3, 2025

From Scope to Script: An Automated Report Generation Model for Gastrointestinal Endoscopy

Evandros Kaklamanos, Kristjana Kristinsdottir, Jonathan Huang et al.

Endoscopic procedures such as esophagogastroduodenoscopy (EGD) and colonoscopy play a critical role in diagnosing and managing gastrointestinal (GI) disorders. However, the documentation burden associated with these procedures place significant strain on gastroenterologists, contributing to inefficiencies in clinical workflows and physician burnout. To address this challenge, we propose a novel automated report generation model that leverages a transformer-based vision encoder and text decoder within a two-stage training framework. In the first stage, both components are pre-trained on image/text caption pairs to capture generalized vision-language features, followed by fine-tuning on images/report pairs to generate clinically meaningful findings. Our approach not only streamlines the documentation process but also holds promise for reducing physician workload and improving patient care.

3.1LGNov 19, 2021

Esophageal virtual disease landscape using mechanics-informed machine learning

Sourav Halder, Jun Yamasaki, Shashank Acharya et al.

The pathogenesis of esophageal disorders is related to the esophageal wall mechanics. Therefore, to understand the underlying fundamental mechanisms behind various esophageal disorders, it is crucial to map the esophageal wall mechanics-based parameters onto physiological and pathophysiological conditions corresponding to altered bolus transit and supraphysiologic IBP. In this work, we present a hybrid framework that combines fluid mechanics and machine learning to identify the underlying physics of the various esophageal disorders and maps them onto a parameter space which we call the virtual disease landscape (VDL). A one-dimensional inverse model processes the output from an esophageal diagnostic device called endoscopic functional lumen imaging probe (EndoFLIP) to estimate the mechanical "health" of the esophagus by predicting a set of mechanics-based parameters such as esophageal wall stiffness, muscle contraction pattern and active relaxation of esophageal walls. The mechanics-based parameters were then used to train a neural network that consists of a variational autoencoder (VAE) that generates a latent space and a side network that predicts mechanical work metrics for estimating esophagogastric junction motility. The latent vectors along with a set of discrete mechanics-based parameters define the VDL and form clusters corresponding to the various esophageal disorders. The VDL not only distinguishes different disorders but can also be used to predict disease progression in time. Finally, we also demonstrate the clinical applicability of this framework for estimating the effectiveness of a treatment and track patient condition after a treatment.

4.4LGJun 25, 2021

A multi-stage machine learning model on diagnosis of esophageal manometry

Wenjun Kou, Dustin A. Carlson, Alexandra J. Baumann et al.

High-resolution manometry (HRM) is the primary procedure used to diagnose esophageal motility disorders. Its interpretation and classification includes an initial evaluation of swallow-level outcomes and then derivation of a study-level diagnosis based on Chicago Classification (CC), using a tree-like algorithm. This diagnostic approach on motility disordered using HRM was mirrored using a multi-stage modeling framework developed using a combination of various machine learning approaches. Specifically, the framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage. In the swallow-level stage, three models based on convolutional neural networks (CNNs) were developed to predict swallow type, swallow pressurization, and integrated relaxation pressure (IRP). At the study-level stage, model selection from families of the expert-knowledge-based rule models, xgboost models and artificial neural network(ANN) models were conducted, with the latter two model designed and augmented with motivation from the export knowledge. A simple model-agnostic strategy of model balancing motivated by Bayesian principles was utilized, which gave rise to model averaging weighted by precision scores. The averaged (blended) models and individual models were compared and evaluated, of which the best performance on test dataset is 0.81 in top-1 prediction, 0.92 in top-2 predictions. This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data. Moreover, the proposed modeling framework could be easily extended to multi-modal tasks, such as diagnosis of esophageal patients based on clinical data from both HRM and functional luminal imaging probe panometry (FLIP).