idMotif: An Interactive Motif Identification in Protein Sequences
This work addresses the challenge of motif identification for domain experts in bioinformatics. The contribution is incremental: it builds on existing deep learning methods and pairs them with interactive visualization.
The authors address motif identification in protein sequences with idMotif, a visual analytics framework that uses a deep learning model to categorize sequences and local explanations of that model's decisions to surface motif candidates. A case study demonstrates its utility.
This article introduces idMotif, a visual analytics framework designed to aid domain experts in the identification of motifs within protein sequences. Motifs, short sequences of amino acids, are critical for understanding the distinct functions of proteins. Identifying these motifs is pivotal for predicting diseases or infections. idMotif employs a deep learning-based method for the categorization of protein sequences, enabling the discovery of potential motif candidates within protein groups through local explanations of deep learning model decisions. It offers multiple interactive views for the analysis of protein clusters or groups and their sequences. A case study, complemented by expert feedback, illustrates idMotif's utility in facilitating the analysis and identification of protein sequences and motifs.
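To make the idea of finding motif candidates through local explanations more concrete, the following is a minimal sketch, not idMotif's implementation. The abstract does not say which explanation technique is used, so occlusion-based attribution is assumed here purely for illustration. The classifier `toy_group_score`, the helper names, the mask character, and the example sequence are all hypothetical.

```python
from typing import Callable, List, Tuple


def toy_group_score(seq: str) -> float:
    """Hypothetical stand-in for a trained classifier.

    Returns 1.0 if the sequence is assigned to a protein group, else 0.0.
    For illustration it simply checks for the tripeptide "RGD".
    """
    return 1.0 if "RGD" in seq else 0.0


def occlusion_importance(
    seq: str, score_fn: Callable[[str], float], mask: str = "X"
) -> List[float]:
    """Score each residue by how much masking it lowers the model output."""
    base = score_fn(seq)
    importances = []
    for i in range(len(seq)):
        occluded = seq[:i] + mask + seq[i + 1:]
        importances.append(base - score_fn(occluded))
    return importances


def top_window(seq: str, importances: List[float], width: int) -> Tuple[int, str]:
    """Return the start index and residues of the highest-scoring window."""
    best_start = max(
        range(len(seq) - width + 1),
        key=lambda s: sum(importances[s:s + width]),
    )
    return best_start, seq[best_start:best_start + width]


seq = "MKTAYRGDLLSA"  # made-up sequence, not real data
imp = occlusion_importance(seq, toy_group_score)
start, motif = top_window(seq, imp, width=3)
print(start, motif)
```

In this toy setting, the window with the largest summed importance is reported as the motif candidate. A real pipeline would replace `toy_group_score` with the trained model's class probability, and an expert would still need to inspect the candidates, which is the role the interactive views play.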