CL | AI | Oct 31, 2024

Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning

arXiv:2411.00173v2 | 25 citations | h-index: 1 | EMNLP
Originality: Incremental advance
AI Analysis

This work addresses the need for transparency in automated medical coding, which matters for maintaining patient trust. It is an incremental advance because it builds on existing interpretability methods.

The paper tackles interpretability in automated medical coding with large language models, where label attention mechanisms often highlight irrelevant tokens. It uses dictionary learning to extract sparse representations that elucidate the meaning of upwards of 90% of medically irrelevant tokens.

Medical coding, the translation of unstructured clinical text into standardized medical codes, is a crucial but time-consuming healthcare practice. Though large language models (LLMs) could automate the coding process and improve the efficiency of such tasks, interpretability remains paramount for maintaining patient trust. Current efforts in interpretability of medical coding applications rely heavily on label attention mechanisms, which often highlight extraneous tokens irrelevant to the ICD code. To facilitate accurate interpretability in medical language models, this paper leverages dictionary learning, which can efficiently extract sparsely activated representations from dense language model embeddings in superposition. Compared with common label attention mechanisms, our model goes beyond token-level representations by building an interpretable dictionary that enhances the mechanistic explanations for each ICD code prediction, even when the highlighted tokens are medically irrelevant. We show that dictionary features can steer model behavior, elucidate the hidden meanings of upwards of 90% of medically irrelevant tokens, and are human interpretable.
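
To make the idea of extracting sparsely activated features from dense embeddings more concrete, here is a minimal sketch of one common form of dictionary learning, a sparse autoencoder with a ReLU encoder and an L1 sparsity penalty. The abstract does not specify the paper's architecture, loss, or hyperparameters, so everything here is an assumption for illustration: the function names (`init_dictionary`, `encode`, `decode`, `sae_loss`, `ablate_feature`), the dimensions, and the `l1_coeff` value are hypothetical. The dictionary is randomly initialized and untrained, and the ablation helper only hints at how zeroing a feature could be used to probe or steer behavior.

```python
import numpy as np

def init_dictionary(d_model, n_features, seed=0):
    """Randomly initialize an overcomplete dictionary (n_features > d_model is typical)."""
    rng = np.random.default_rng(seed)
    w_dec = rng.normal(size=(n_features, d_model))
    # Normalize each dictionary row (feature direction) to unit length.
    w_dec /= np.linalg.norm(w_dec, axis=1, keepdims=True)
    return {
        "W_enc": w_dec.T.copy(),        # (d_model, n_features), tied at init only
        "b_enc": np.zeros(n_features),
        "W_dec": w_dec,                 # (n_features, d_model)
        "b_dec": np.zeros(d_model),
    }

def encode(params, x):
    """Map dense embeddings x (batch, d_model) to non-negative feature activations."""
    pre = (x - params["b_dec"]) @ params["W_enc"] + params["b_enc"]
    return np.maximum(0.0, pre)  # ReLU; the L1 penalty in training drives sparsity

def decode(params, f):
    """Reconstruct dense embeddings from feature activations f (batch, n_features)."""
    return f @ params["W_dec"] + params["b_dec"]

def sae_loss(params, x, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty on the feature activations."""
    f = encode(params, x)
    x_hat = decode(params, f)
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=-1))
    sparsity = np.mean(np.sum(np.abs(f), axis=-1))
    return recon + l1_coeff * sparsity

def ablate_feature(params, x, idx):
    """Zero one dictionary feature and reconstruct, a simple probe of its effect."""
    f = encode(params, x)
    f[..., idx] = 0.0
    return decode(params, f)

# Hypothetical usage on random stand-ins for token embeddings.
params = init_dictionary(d_model=8, n_features=32)
x = np.random.default_rng(1).normal(size=(4, 8))
features = encode(params, x)
x_ablated = ablate_feature(params, x, idx=0)
```

In practice the parameters would be fit by gradient descent on `sae_loss` over embeddings collected from the language model, and each learned feature would then be inspected, for example by looking at the tokens that activate it most strongly.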

