BMDec 4, 2025Code
STAR-GO: Improving Protein Function Prediction by Learning to Hierarchically Integrate Ontology-Informed Semantic EmbeddingsMehmet Efe Akça, Gökçe Uludoğan, Arzucan Özgür et al.
Accurate prediction of protein function is essential for elucidating molecular mechanisms and advancing biological and therapeutic discovery. Yet experimental annotation lags far behind the rapid growth of protein sequence data. Computational approaches address this gap by associating proteins with Gene Ontology (GO) terms, which encode functional knowledge through hierarchical relations and textual definitions. However, existing models often emphasize one modality over the other, limiting their ability to generalize, particularly to unseen or newly introduced GO terms that frequently arise as the ontology evolves, and making the previously trained models outdated. We present STAR-GO, a Transformer-based framework that jointly models the semantic and structural characteristics of GO terms to enhance zero-shot protein function prediction. STAR-GO integrates textual definitions with ontology graph structure to learn unified GO representations, which are processed in hierarchical order to propagate information from general to specific terms. These representations are then aligned with protein sequence embeddings to capture sequence-function relationships. STAR-GO achieves state-of-the-art performance and superior zero-shot generalization, demonstrating the utility of integrating semantics and structure for robust and adaptable protein function prediction. Code is available at https://github.com/boun-tabi-lifelu/stargo.
LGJul 25, 2023
EdgeConvEns: Convolutional Ensemble Learning for Edge IntelligenceIlkay Sikdokur, İnci M. Baytaş, Arda Yurdakul
Deep edge intelligence aims to deploy deep learning models that demand computationally expensive training in the edge network with limited computational power. Moreover, many deep edge intelligence applications require handling distributed data that cannot be transferred to a central server due to privacy concerns. Decentralized learning methods, such as federated learning, offer solutions where models are learned collectively by exchanging learned weights. However, they often require complex models that edge devices may not handle and multiple rounds of network communication to achieve state-of-the-art performances. This study proposes a convolutional ensemble learning approach, coined EdgeConvEns, that facilitates training heterogeneous weak models on edge and learning to ensemble them where data on edge are heterogeneously distributed. Edge models are implemented and trained independently on Field-Programmable Gate Array (FPGA) devices with various computational capacities. Learned data representations are transferred to a central server where the ensemble model is trained with the learned features received from the edge devices to boost the overall prediction performance. Extensive experiments demonstrate that the EdgeConvEns can outperform the state-of-the-art performance with fewer communications and less data in various training scenarios.