CL | AI | Aug 3, 2023

Local Large Language Models for Complex Structured Medical Tasks

arXiv:2308.01727v1 | 9 citations | h-index: 4
Originality: Synthesis-oriented
AI Analysis

This work addresses complex data extraction and classification tasks in the medical domain, such as processing pathology reports. It is incremental, however, because it applies existing LLM methods to a new dataset.

The authors extract structured condition codes from pathology reports by combining local large language models (LLMs) with domain-specific training. The resulting LLaMA-based models significantly outperform BERT-style models across all metrics, even at reduced numerical precision.

This paper introduces an approach that combines the language reasoning capabilities of large language models (LLMs) with the benefits of local training to tackle complex, domain-specific tasks. The authors demonstrate it by extracting structured condition codes from pathology reports. Local LLMs can be fine-tuned to respond to specific generative instructions and to return structured outputs. The authors collected a dataset of over 150k uncurated surgical pathology reports containing gross descriptions, final diagnoses, and condition codes, then trained and evaluated several model architectures, including LLaMA, BERT, and Longformer. The LLaMA-based models significantly outperform the BERT-style models across all evaluated metrics, even at extremely reduced numerical precision. They performed especially well with large datasets, demonstrating their ability to handle complex, multi-label tasks. Overall, the work presents an effective way to use LLMs for domain-specific tasks on accessible hardware, with potential applications wherever the medical domain requires complex data extraction and classification.
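The instruction-and-structured-output setup described above could look roughly like the following. This is a minimal sketch, not the authors' pipeline: the instruction wording, the JSON-list output format, the record field names, the placeholder codes, and the choice of micro-averaged F1 as a multi-label metric are all assumptions made for illustration.

```python
import json

# Hypothetical instruction text; the paper's actual prompt is not given in this summary.
INSTRUCTION = "Extract the condition codes from this pathology report as a JSON list."


def build_example(gross_description, final_diagnosis, codes):
    """Pack one report into an instruction/input/output training record (assumed layout)."""
    report = (
        f"Gross description: {gross_description}\n"
        f"Final diagnosis: {final_diagnosis}"
    )
    return {
        "instruction": INSTRUCTION,
        "input": report,
        # Sorting keeps the target string deterministic for a given code set.
        "output": json.dumps(sorted(codes)),
    }


def parse_codes(generated_text):
    """Parse generated text into a set of codes; malformed output yields an empty set."""
    try:
        parsed = json.loads(generated_text)
    except json.JSONDecodeError:
        return set()
    if not isinstance(parsed, list):
        return set()
    return {str(code) for code in parsed}


def micro_f1(true_sets, predicted_sets):
    """Micro-averaged F1 over all (report, code) pairs in a multi-label task."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, predicted_sets):
        tp += len(truth & pred)   # codes predicted and correct
        fp += len(pred - truth)   # codes predicted but wrong
        fn += len(truth - pred)   # codes missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Placeholder report text and codes, invented for this example.
    record = build_example("Skin ellipse.", "Example diagnosis.", {"CODE_B", "CODE_A"})
    print(record["output"])
    print(parse_codes('["CODE_A", "CODE_B"]'))
```

Treating malformed generations as an empty prediction, rather than raising, keeps every report in the evaluation, so formatting failures count against recall instead of being silently dropped.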

Code Implementations: 1 repo