CL | LG | Mar 17, 2024

Cheap Ways of Extracting Clinical Markers from Texts

arXiv:2403.11227v1 | 104 citations | h-index: 7 | CLPsych
Originality: Synthesis-oriented
AI Analysis

This work addresses suicide risk evaluation in clinical settings, but it is incremental as it compares existing methods without introducing new techniques.

The paper tackles extracting clinical markers from texts for suicide risk assessment by comparing a traditional machine learning pipeline with a large language model (LLM) approach. The LLM method is more resource-intensive, but it generates the evidence summaries and, guided by chain-of-thought, returns sequences of text that indicate clinical markers.

This paper describes the work of the UniBuc Archaeology team for CLPsych's 2024 Shared Task, which involved finding evidence within the text supporting the assigned suicide risk level. Two types of evidence were required: highlights (relevant spans extracted from the text) and summaries (evidence aggregated into a synthesis). Our work focuses on evaluating Large Language Models (LLMs) against an alternative method that is much more memory- and resource-efficient. The first approach employs a good old-fashioned machine learning (GOML) pipeline consisting of a tf-idf vectorizer and a logistic regression classifier, whose representative features are used to extract relevant highlights. The second, more resource-intensive approach uses an LLM to generate the summaries and is guided by chain-of-thought to provide sequences of text indicating clinical markers.
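
As a rough illustration of the first (GOML) approach described above, the following is a minimal sketch and not the authors' implementation. It assumes a binary risk label, uses only the Python standard library in place of any particular ML toolkit, and treats "representative features" as tokens with positive classifier weights; sentences are then ranked by their weighted tf-idf score. The toy texts, the function names (`fit_idf`, `tfidf`, `train_logreg`, `extract_highlights`), and all hyperparameters are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def fit_idf(docs):
    """Smoothed inverse document frequency for every training token."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(text, idf):
    """L2-normalised sparse tf-idf vector; tokens unseen in training are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def train_logreg(vectors, labels, epochs=300, lr=0.5, l2=0.01):
    """Binary logistic regression fitted with plain SGD and L2 shrinkage."""
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            z = bias + sum(weights[t] * v for t, v in vec.items())
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            for t, v in vec.items():
                weights[t] -= lr * (err * v + l2 * weights[t])
            bias -= lr * err
    return weights, bias


def extract_highlights(post, idf, weights, top_k=1):
    """Rank sentences by weighted tf-idf score; keep only positively scored ones."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", post) if s.strip()]
    scored = []
    for sentence in sentences:
        vec = tfidf(sentence, idf)
        scored.append((sum(weights[t] * v for t, v in vec.items()), sentence))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:top_k] if score > 0]


# Synthetic toy data, invented for illustration only (1 = at risk, 0 = not).
train_docs = [
    "I feel hopeless and cannot sleep at night",
    "Everything feels hopeless and I am a burden",
    "We went hiking with friends this weekend",
    "Cooking dinner with friends was fun",
]
labels = [1, 1, 0, 0]

idf = fit_idf(train_docs)
weights, bias = train_logreg([tfidf(d, idf) for d in train_docs], labels)

post = "We went hiking yesterday. Lately I feel hopeless and like a burden."
highlights = extract_highlights(post, idf, weights, top_k=1)
print(highlights)
```

Scoring whole sentences is only one way to turn feature weights into spans; a token-window or phrase-level selection would fit the same description, and the source does not say which granularity the team used.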
