What's in a Name? Morphological Shortcuts by LLMs in Pharmacology
For developers and users of LLMs in high-stakes medical domains, this study reveals a subtle but measurable safety risk: models take morphological shortcuts when interpreting drug names.
LLMs rely on morphological affixes in drug names to infer pharmacological properties, which leads to overgeneralization and plausible but incorrect clinical content. The study shows that affix cues alone elicit class-level responses, and activation patching localizes this behavior to the early-to-mid layers of the models.
The morphological form of a word can often give cues to its meaning, but purely relying on these mappings can lead to overgeneralization in high-stakes domains. In the medical domain, for instance, LLMs can confidently reason about fictitious drugs from their affixes alone (e.g., wugcillin) and generate plausible-looking clinical content. We present a behavioral and mechanistic study of LLM "affix heuristics" in pharmacology. Using fictitious drug names built from real affixes, we show that affix signals alone elicit class-level pharmacological responses. We introduce a framework for identifying whether a model's drug semantics are driven mainly by the affix, the stem, or the drug name as a whole. Applied across 653 drugs, our framework reveals that models often induce drug meaning primarily through affix cues, yet rarely explicitly indicate this reliance, and sometimes incorrectly conflate properties among affix-sharing drugs. Activation patching across models further localizes this behavior to early-mid layers. These findings show that morphological shortcuts pose a subtle but measurable risk to safety.
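To make the probing setup concrete, here is a minimal sketch of how fictitious drug names might be assembled from real affixes. It is an illustration under stated assumptions, not the study's actual pipeline: the affix list, the nonce stems other than "wug", the prompt wording, and the helper names `build_fictitious_drugs` and `make_prompt` are all hypothetical.

```python
from itertools import product

# Real drug-name suffixes and the class each conventionally signals.
# This small selection is an assumption; the study's affix inventory may differ.
AFFIX_CLASS = {
    "cillin": "penicillin antibiotic",
    "olol": "beta blocker",
    "pril": "ACE inhibitor",
    "prazole": "proton pump inhibitor",
}

# Nonce stems. Only "wug" appears in the source (wugcillin); the rest are hypothetical.
NONCE_STEMS = ["wug", "blick", "dax"]


def build_fictitious_drugs(stems, affix_class):
    """Pair every nonce stem with every real affix to form a fictitious drug name."""
    return [
        {"name": stem + affix, "stem": stem, "affix": affix, "implied_class": cls}
        for stem, (affix, cls) in product(stems, affix_class.items())
    ]


def make_prompt(name):
    """Hypothetical probe: ask for the drug class without flagging the name as fake."""
    return f"What drug class does {name} belong to?"


drugs = build_fictitious_drugs(NONCE_STEMS, AFFIX_CLASS)
prompts = [make_prompt(d["name"]) for d in drugs]
```

If a model answers such a prompt with the class in `implied_class`, without noting that the drug does not exist, the response is consistent with the affix heuristic described above. Varying the stem while holding the affix fixed (and vice versa) is one simple way to separate affix-driven from stem-driven behavior.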