The Application of ChatGPT in Responding to Questions Related to the Boston Bowel Preparation Scale
This study addresses the problem of automating medical image scoring for gastroenterologists, but its contribution is incremental: it applies an existing method to a new domain with limited performance gains.
This study assessed ChatGPT's accuracy and consistency in using the Boston Bowel Preparation Scale for colonoscopy assessment, finding that ChatGPT's accuracy ranged from 48.93% to 62.66% and kappa values from 0.52 to 0.53, which trailed endoscopists' accuracy of 76.68% to 77.83% and kappa values of 0.75 to 0.87.
Background: Colonoscopy, a crucial diagnostic tool in gastroenterology, depends heavily on high-quality bowel preparation. ChatGPT, a large language model with emergent intelligence, also exhibits potential in medical applications. This study aims to assess the accuracy and consistency of ChatGPT in using the Boston Bowel Preparation Scale (BBPS) for colonoscopy assessment. Methods: We retrospectively collected 233 colonoscopy images from 2020 to 2023. These images were evaluated using the BBPS by 3 senior endoscopists and 3 novice endoscopists. ChatGPT, divided into three groups that each underwent specific fine-tuning, also assessed these images. Consistency was evaluated through two rounds of testing. Results: In the initial round, ChatGPT's accuracy varied between 48.93% and 62.66%, trailing the endoscopists' accuracy of 76.68% to 77.83%. Kappa values for ChatGPT were between 0.52 and 0.53, compared to 0.75 to 0.87 for the endoscopists. Conclusion: While ChatGPT shows promise in bowel preparation scoring, it currently does not match the accuracy and consistency of experienced endoscopists. Future research should focus on in-depth fine-tuning.
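To make the two reported metrics concrete, the following is a minimal sketch of how accuracy against a reference score and a kappa agreement statistic could be computed for BBPS ratings. It assumes an unweighted Cohen's kappa between two sets of ratings; the abstract does not state which kappa variant was used, so that choice is an assumption. The function names and the sample scores are hypothetical and are not data from the study.

```python
from collections import Counter


def accuracy(reference, ratings):
    """Fraction of images where a rater's score equals the reference score."""
    if not reference or len(reference) != len(ratings):
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(r == s for r, s in zip(reference, ratings))
    return matches / len(reference)


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa between two sets of categorical ratings.

    Assumption: the study's kappa variant is not specified in the abstract;
    this is the plain two-rater, unweighted form.
    """
    if not ratings_a or len(ratings_a) != len(ratings_b):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items given the same score by both.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement from each rater's marginal score frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical category; agreement is total.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical BBPS segment scores (0-3) for eight images, for illustration only.
round_one = [0, 1, 2, 3, 3, 2, 1, 0]
round_two = [0, 1, 2, 3, 2, 2, 1, 1]

print(f"accuracy: {accuracy(round_one, round_two):.2%}")
print(f"kappa:    {cohen_kappa(round_one, round_two):.2f}")
```

Because BBPS scores are ordinal, a weighted kappa that penalizes a 0-versus-3 disagreement more than a 2-versus-3 disagreement may be a better fit in practice; the unweighted form is used here only because it is the simplest to follow.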