CL · Nov 6, 2025

T-FIX: Text-Based Explanations with Features Interpretable to eXperts

arXiv:2511.04070v1 · 2 citations · h-index: 50
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for meaningful explanations in expert settings such as surgery and astronomy, though it is incremental in that it focuses on evaluation rather than on new methods.

The paper tackles the problem of evaluating LLM explanations for domain experts by introducing T-FIX, a benchmark across seven knowledge-intensive domains with metrics developed in collaboration with experts to measure alignment with expert judgment.

As LLMs are deployed in knowledge-intensive settings (e.g., surgery, astronomy, therapy), users expect not just answers, but also meaningful explanations for those answers. In these settings, users are often domain experts (e.g., doctors, astrophysicists, psychologists) who require explanations that reflect expert-level reasoning. However, current evaluation schemes primarily emphasize plausibility or internal faithfulness of the explanation, which fail to capture whether the content of the explanation truly aligns with expert intuition. We formalize expert alignment as a criterion for evaluating explanations with T-FIX, a benchmark spanning seven knowledge-intensive domains. In collaboration with domain experts, we develop novel metrics to measure the alignment of LLM explanations with expert judgment.
