LG · Aug 20, 2025

Multimodal Quantum Vision Transformer for Enzyme Commission Classification from Biochemical Representations

arXiv:2508.14844v1 · 1 citation · h-index: 47 · 2025 IEEE International Conference on Quantum Artificial Intelligence (QAI)
Originality: Incremental advance
AI Analysis

This work addresses enzyme classification in computational biology. Its multimodal approach is an incremental advance that combines existing quantum and vision transformer methods.

The paper tackles the challenge of predicting enzyme function with a multimodal Quantum Machine Learning framework that integrates four biochemical modalities. The model achieves a top-1 accuracy of 85.1% and outperforms sequence-only baselines.

Accurately predicting enzyme functionality remains one of the major challenges in computational biology, particularly for enzymes with limited structural annotations or sequence homology. We present a novel multimodal Quantum Machine Learning (QML) framework that enhances Enzyme Commission (EC) classification by integrating four complementary biochemical modalities: protein sequence embeddings, quantum-derived electronic descriptors, molecular graph structures, and 2D molecular image representations. The framework uses a Quantum Vision Transformer (QVT) backbone equipped with modality-specific encoders and a unified cross-attention fusion module. By integrating graph features and spatial patterns, our method captures key stereoelectronic interactions behind enzyme function. Experimental results demonstrate that our multimodal QVT model achieves a top-1 accuracy of 85.1%, outperforming sequence-only baselines by a substantial margin and also outperforming other QML models.
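
To make the fusion step in the abstract concrete, here is a minimal, purely classical sketch of attention-based fusion over four modality embeddings. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the fusion module, so the single query vector, the omission of learned projection matrices and quantum layers, the function names, and the toy embedding values are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention_fuse(query, modality_embeddings):
    """Fuse modality embeddings with scaled dot-product attention.

    query: list of floats of length d (hypothetical fusion query).
    modality_embeddings: dict mapping modality name to a length-d vector.
    Returns (fused_vector, attention_weights_by_modality).
    Assumption: no learned key/value/query projections are applied.
    """
    names = list(modality_embeddings)
    d = len(query)
    scores = [dot(query, modality_embeddings[n]) / math.sqrt(d) for n in names]
    weights = softmax(scores)
    fused = [
        sum(w * modality_embeddings[n][i] for w, n in zip(weights, names))
        for i in range(d)
    ]
    return fused, dict(zip(names, weights))


# Toy embeddings for the four modalities named in the abstract
# (values are made up for illustration only).
embeddings = {
    "sequence": [0.9, 0.1, 0.0, 0.2],
    "electronic": [0.1, 0.8, 0.3, 0.0],
    "graph": [0.2, 0.2, 0.7, 0.1],
    "image": [0.0, 0.1, 0.2, 0.9],
}
query = [1.0, 0.5, 0.5, 0.0]

fused, weights = cross_attention_fuse(query, embeddings)
print("attention weights:", weights)
print("fused embedding:", fused)
```

In this sketch the attention weights form a probability distribution over the modalities, so the fused vector is a convex combination of the four embeddings. A real fusion module would typically add learned projections and multiple heads before passing the result to a classifier over EC classes.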

