LG CR · Nov 7, 2022

Towards learning to explain with concept bottleneck models: mitigating information leakage

Joshua Lockhart, Nicolas Marchesotti, Daniele Magazzeni, Manuela Veloso

arXiv:2211.03656v1 · 9.67 citations · h-index: 32

Originality: Incremental advance

AI Analysis

This work addresses trust issues in interpretable AI for users relying on concept-based explanations, though it is incremental as it builds on existing methods to fix a specific flaw.

The paper tackles information leakage in concept bottleneck models that use soft concept labels, a problem that undermines trust in the model, and shows that Monte-Carlo Dropout can mitigate this leakage to produce more reliable concept predictions.

Concept bottleneck models perform classification by first predicting which of a list of human-provided concepts are true about a datapoint. Then a downstream model uses these predicted concept labels to predict the target label. The predicted concepts act as a rationale for the target prediction. Model trust issues emerge in this paradigm when soft concept labels are used: it has previously been observed that extra information about the data distribution leaks into the concept predictions. In this work we show how Monte-Carlo Dropout can be used to attain soft concept predictions that do not contain leaked information.
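The two-stage pipeline described in the abstract can be sketched in a few lines of plain Python. This is an illustrative toy, not the authors' implementation: the network shape, the weights, and the names `concept_forward`, `mc_dropout_concepts`, and `label_predictor` are all assumptions made for the example. It uses the generic form of Monte-Carlo Dropout (keep dropout active at inference and average several stochastic passes); the abstract does not say whether the paper averages probabilities or thresholded votes, so that choice is also an assumption.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def concept_forward(x, w_hidden, w_concepts, drop_p, rng):
    """One stochastic pass: input -> ReLU hidden layer with dropout -> concept probabilities."""
    hidden = []
    for row in w_hidden:
        h = max(0.0, sum(w * xi for w, xi in zip(row, x)))
        # Inverted dropout kept ON at inference time (assumes drop_p < 1).
        keep = rng.random() >= drop_p
        hidden.append(h / (1.0 - drop_p) if keep else 0.0)
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_concepts]


def mc_dropout_concepts(x, w_hidden, w_concepts, drop_p=0.5, n_samples=200, seed=0):
    """Soft concept predictions: the mean over n_samples stochastic forward passes."""
    rng = random.Random(seed)
    totals = [0.0] * len(w_concepts)
    for _ in range(n_samples):
        probs = concept_forward(x, w_hidden, w_concepts, drop_p, rng)
        totals = [t + p for t, p in zip(totals, probs)]
    return [t / n_samples for t in totals]


def label_predictor(concepts, w_label):
    """Downstream model: sees only the concept vector, never the raw input."""
    return sigmoid(sum(w * c for w, c in zip(w_label, concepts)))


# Hypothetical toy weights: 3 input features, 4 hidden units, 2 concepts.
X = [0.5, -1.0, 2.0]
W_HIDDEN = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.2], [-0.5, 0.6, 0.4], [0.1, 0.1, 0.9]]
W_CONCEPTS = [[1.0, -0.5, 0.3, 0.8], [-0.6, 0.9, 0.2, -0.4]]
W_LABEL = [1.5, -2.0]

soft_concepts = mc_dropout_concepts(X, W_HIDDEN, W_CONCEPTS)
print("soft concepts:", soft_concepts)
print("target probability:", label_predictor(soft_concepts, W_LABEL))
```

In this sketch the soft value for each concept reflects how the prediction varies under dropout noise rather than a single raw sigmoid output, and the label predictor is restricted to the concept vector, which is what makes the concepts usable as a rationale.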
