Multilingual Cognitive Impairment Detection in the Era of Foundation Models
This work addresses cognitive impairment detection for multilingual clinical applications, but it is incremental, comparing existing methods without introducing new techniques.
The study tackled cognitive impairment detection from speech transcripts in English, Slovene, and Korean, finding that supervised tabular models with engineered linguistic features and embeddings generally outperformed zero-shot LLMs, with performance gains varying by language in few-shot settings.
We evaluate cognitive impairment (CI) classification from speech transcripts in English, Slovene, and Korean. We compare zero-shot large language models (LLMs) used as direct classifiers under three input settings -- transcript-only, linguistic-features-only, and combined -- with supervised tabular approaches trained under a leave-one-out protocol. The tabular models operate on engineered linguistic features, transcript embeddings, and early or late fusion of the two. Across languages, zero-shot LLMs provide competitive no-training baselines, but supervised tabular models generally perform better, particularly when engineered linguistic features are combined with embeddings. Few-shot experiments on embeddings indicate that the value of limited supervision is language-dependent: some languages benefit substantially from additional labelled examples, while others remain constrained without richer feature representations. Overall, the results suggest that, in small-data CI detection, structured linguistic signals and simple fusion-based classifiers remain strong and reliable.
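To make the difference between early and late fusion under leave-one-out evaluation concrete, the following is a minimal sketch, not the classifiers used in the study. It makes several assumptions: leave-one-out is read as holding out one sample at a time, a nearest-centroid rule stands in for the tabular models, and the feature values, labels, and all function names (`centroid`, `ci_score`, `loo_accuracy`) are invented for illustration.

```python
from math import dist
from statistics import fmean


def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [fmean(col) for col in zip(*rows)]


def ci_score(train_X, train_y, x):
    """Nearest-centroid score; positive means x is closer to the CI class (label 1)."""
    c0 = centroid([r for r, y in zip(train_X, train_y) if y == 0])
    c1 = centroid([r for r, y in zip(train_X, train_y) if y == 1])
    return dist(x, c0) - dist(x, c1)


def loo_accuracy(ling, emb, labels, fusion):
    """Leave-one-out accuracy with 'early' or 'late' fusion of two modalities."""
    n = len(labels)
    correct = 0
    for i in range(n):
        keep = [j for j in range(n) if j != i]  # hold out sample i
        y_tr = [labels[j] for j in keep]
        if fusion == "early":
            # Early fusion: concatenate both feature vectors, fit one model.
            X_tr = [ling[j] + emb[j] for j in keep]
            score = ci_score(X_tr, y_tr, ling[i] + emb[i])
        elif fusion == "late":
            # Late fusion: one model per modality, then average the scores.
            s_ling = ci_score([ling[j] for j in keep], y_tr, ling[i])
            s_emb = ci_score([emb[j] for j in keep], y_tr, emb[i])
            score = (s_ling + s_emb) / 2
        else:
            raise ValueError("fusion must be 'early' or 'late'")
        correct += int((1 if score > 0 else 0) == labels[i])
    return correct / n


# Hypothetical toy data: 0 = control, 1 = cognitive impairment.
labels = [0, 0, 0, 1, 1, 1]
ling = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3],
        [0.9, 0.8], [1.0, 0.7], [0.8, 1.0]]   # engineered linguistic features
emb = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.0, 0.0, 0.2],
       [1.0, 0.9, 1.0], [0.9, 1.0, 0.8], [1.0, 1.0, 1.0]]  # transcript embeddings

early_acc = loo_accuracy(ling, emb, labels, "early")
late_acc = loo_accuracy(ling, emb, labels, "late")
```

In this sketch, early fusion lets a single model see both modalities jointly, whereas late fusion keeps the per-modality models independent and only combines their outputs. With real features on different scales, standardising each modality before either kind of fusion would likely matter, which this toy example does not address.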