LG ML
Jun 25, 2020

Background Knowledge Injection for Interpretable Sequence Classification

Severin Gsponer, Luca Costabello, Chan Le Van, Sumit Pai, Christophe Gueret, Georgiana Ifrim, Freddy Lecue

arXiv:2006.14248v12.31 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretable models in sequence classification tasks, such as human activity recognition and bioinformatics, but it is incremental as it builds on existing linear classifiers and embeddings.

The paper tackles the problem of achieving interpretability in sequence classification without sacrificing accuracy. It introduces a novel algorithm that combines linear classifiers with background knowledge embeddings, and experiments on human activity recognition and amino acid sequence classification show that it preserves predictive power while delivering more interpretable models.

Sequence classification is the supervised learning task of building models that predict class labels of unseen sequences of symbols. Although accuracy is paramount, in certain scenarios interpretability is a must. Unfortunately, such a trade-off is often hard to achieve since we lack human-independent interpretability metrics. We introduce a novel sequence learning algorithm that combines (i) linear classifiers, which are known to strike a good balance between predictive power and interpretability, and (ii) background knowledge embeddings. We extend the classic subsequence feature space with groups of symbols which are generated by background knowledge injected via word or graph embeddings, and use this new feature space to learn a linear classifier. We also present a new measure to evaluate the interpretability of a set of symbolic features based on the symbol embeddings. Experiments on human activity recognition from wearables and amino acid sequence classification show that our classification approach preserves predictive power, while delivering more interpretable models.
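
The idea of extending the subsequence feature space with embedding-derived symbol groups can be illustrated with a minimal sketch. This is not the authors' implementation: the toy embeddings, the greedy cosine-similarity grouping, the `threshold` value, the restriction to contiguous subsequences, and the function names `group_symbols` and `extended_features` are all assumptions made for illustration.

```python
from itertools import product
from math import sqrt

# Hypothetical 2-D embeddings for activity symbols (illustrative values only).
EMBEDDINGS = {
    "walk": (1.0, 0.1),
    "run": (0.9, 0.2),
    "sit": (0.1, 1.0),
    "lie": (0.2, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def group_symbols(embeddings, threshold=0.9):
    """Greedily put each symbol into the first group whose seed is similar enough.

    Assumption: a simple stand-in for whatever grouping the paper uses.
    """
    groups = []  # each group is a list of symbols; element 0 is the seed
    for sym in sorted(embeddings):
        for group in groups:
            if cosine(embeddings[sym], embeddings[group[0]]) >= threshold:
                group.append(sym)
                break
        else:
            groups.append([sym])
    # Map every symbol to a readable group label such as "{run|walk}".
    return {s: "{" + "|".join(g) + "}" for g in groups for s in g}


def extended_features(sequence, symbol_to_group, max_len=2):
    """Count contiguous subsequences in which each position is either
    the original symbol or the label of its group."""
    counts = {}
    for n in range(1, max_len + 1):
        for i in range(len(sequence) - n + 1):
            window = sequence[i:i + n]
            options = [(s, symbol_to_group[s]) for s in window]
            for combo in set(product(*options)):
                counts[combo] = counts.get(combo, 0) + 1
    return counts


symbol_to_group = group_symbols(EMBEDDINGS)
features_a = extended_features(["walk", "sit"], symbol_to_group)
features_b = extended_features(["run", "lie"], symbol_to_group)
```

In this sketch the two sequences share no symbol-level feature, but both contain the group-level feature `("{run|walk}", "{lie|sit}")`. Count vectors of this kind could then be passed to any linear classifier, whose weights attach directly to readable symbol or group patterns.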
