CLDec 23, 2024

From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question Answering

arXiv:2412.17701v23 citationsh-index: 31ICLR
Originality Incremental advance
AI Analysis

This addresses the issue of trust and interpretability in language models for users in domains like question answering, though it is incremental as it builds on existing reasoning methods.

The paper tackles the problem of making language models' topical knowledge interpretable and verifiable by distilling it into concise 'microtheories'—sets of sentences that encapsulate core knowledge and entail answers to questions. The result shows improvements in grounding answers (up to +8% more grounded answers) and accuracy (up to +8% absolute) when augmenting corpora with these microtheories.

Recent reasoning methods (e.g., chain-of-thought, entailment reasoning) help users understand how language models (LMs) answer a single question, but they do little to reveal the LM's overall understanding, or "theory," about the question's topic, making it still hard to trust the model. Our goal is to materialize such theories - here called microtheories (a linguistic analog of logical microtheories) - as a set of sentences encapsulating an LM's core knowledge about a topic. These statements systematically work together to entail answers to a set of questions to both engender trust and improve performance. Our approach is to first populate a knowledge store with (model-generated) sentences that entail answers to training questions and then distill those down to a core microtheory that is concise, general, and non-redundant. We show that, when added to a general corpus (e.g., Wikipedia), microtheories can supply critical, topical information not necessarily present in the corpus, improving both a model's ability to ground its answers to verifiable knowledge (i.e., show how answers are systematically entailed by documents in the corpus, fully grounding up to +8% more answers), and the accuracy of those grounded answers (up to +8% absolute). We also show that, in a human evaluation in the medical domain, our distilled microtheories contain a significantly higher concentration of topically critical facts than the non-distilled knowledge store. Finally, we show we can quantify the coverage of a microtheory for a topic (characterized by a dataset) using a notion of $p$-relevance. Together, these suggest that microtheories are an efficient distillation of an LM's topic-relevant knowledge, that they can usefully augment existing corpora, and can provide both performance gains and an interpretable, verifiable window into the model's knowledge of a topic.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes