LGBM | Sep 7, 2023

Insights Into the Inner Workings of Transformer Models for Protein Function Prediction

arXiv:2309.03631v2 | 16 citations | h-index: 12 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work provides incremental insights into model interpretability for protein function prediction, aiding biologists and computational researchers.

The researchers tackled the problem of understanding how transformer models predict protein functions by extending integrated gradients to inspect latent representations, enabling identification of amino acids and transformer heads that align with biological expectations and ground truth annotations.

Motivation: We explored how explainable artificial intelligence (XAI) can help shed light on the inner workings of neural networks for protein function prediction. To this end, we extended the widely used XAI method of integrated gradients so that latent representations inside transformer models, which were fine-tuned for Gene Ontology term and Enzyme Commission number prediction, can be inspected as well.

Results: The approach enabled us to identify amino acids in the sequences that the transformers pay particular attention to, and to show that these relevant sequence parts reflect expectations from biology and chemistry. This holds both in the embedding layer and inside the model, where we identified transformer heads whose attribution maps correspond, with statistical significance, to ground truth sequence annotations (e.g. transmembrane regions, active sites) across many proteins.

Availability and Implementation: Source code can be accessed at https://github.com/markuswenzel/xai-proteins.
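For readers unfamiliar with integrated gradients, the core idea can be sketched on a toy model. This is a minimal illustration, not the authors' implementation: the function names, the tanh model, the weights, and the all-zero baseline are assumptions chosen for the example. The attribution for each input dimension is the input-minus-baseline difference multiplied by the gradient averaged along a straight path from the baseline to the input. In principle the same recipe can be applied to a hidden representation instead of the raw input, which is the direction the abstract describes.

```python
import numpy as np

def model(x, w):
    """Toy scalar model F(x) = tanh(w . x); a hypothetical stand-in for a network output."""
    return np.tanh(x @ w)

def model_grad(x, w):
    """Analytic gradient of the toy model with respect to x."""
    return (1.0 - np.tanh(x @ w) ** 2) * w

def integrated_gradients(x, baseline, w, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    # Interpolation coefficients at the midpoints of `steps` equal intervals in [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    # Points on the straight line from the baseline to the input, shape (steps, d).
    path = baseline + alphas[:, None] * (x - baseline)
    # Gradient at every point on the path, then averaged over the path.
    grads = np.stack([model_grad(p, w) for p in path])
    avg_grad = grads.mean(axis=0)
    # Scale the averaged gradient by the input difference.
    return (x - baseline) * avg_grad

# Illustrative values only (assumed, not taken from the paper).
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 0.5, 0.25])
baseline = np.zeros_like(x)

attributions = integrated_gradients(x, baseline, w)
```

A useful sanity check is the completeness property: the attributions should sum approximately to `model(x, w) - model(baseline, w)`, with the gap shrinking as `steps` grows.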

Code Implementations: 1 repo