Tijana Milenković

h-index 33

6 papers

49 citations

Novelty 46%

AI Score 40

Ranked #76,036 of 194,257 authors (top 39%), #16,981 in LG (top 42%)

6 Papers

5.7 LG May 28

Traditional machine learning vs. deep learning from dynamic graph representations of proteins' 3D folds in the task of protein structure classification

Aydin Wells, Francis A. Gatsi, Aaron Striegel et al.

Protein structure classification (PSC) uses supervised learning to predict a protein's CATH/SCOP(e) class from the protein's sequence or 3D structural feature(s). We already modeled 3D structures as (static) protein structure networks (PSNs), demonstrating the competitiveness of PSN-based features to sequence or direct (i.e. non-network) 3D structural features in the PSC task. More recently, we demonstrated the power of features extracted from dynamic PSNs over features extracted from static PSNs (and thus by transitivity over sequence and direct 3D structural features) in the same task. That dynamic PSN approach used traditional machine learning (ML), combining manual (pre-engineered) features with an off-the-shelf classifier. Here, we evaluate whether automatic deep learning (DL) from the dynamic PSNs yields improvements. Our evaluation on 72 datasets spanning ~44,000 CATH- or SCOPe-labeled dynamic PSNs reveals that in terms of PSC accuracy, traditional ML and DL are (close to) tied for a large majority of the datasets, while DL is on average 10+ times slower. We are the first to evaluate traditional ML vs. DL in the dynamic PSN-based PSC task.

1.2 ML Oct 7, 2019

Weighted graphlets and deep neural networks for protein structure classification

Hongyu Guo, Khalique Newaz, Scott Emrich et al.

As proteins with similar structures often have similar functions, analysis of protein structures can help predict protein functions and is thus important. We consider the problem of protein structure classification, which computationally classifies the structures of proteins into pre-defined groups. We develop a weighted network that depicts the protein structures, and more importantly, we propose the first graphlet-based measure that applies to weighted networks. Further, we develop a deep neural network (DNN) composed of both convolutional and recurrent layers to use this measure for classification. Put together, our approach shows dramatic improvements in performance over existing graphlet-based approaches on 36 real datasets. Even compared with the state-of-the-art approach, it almost halves the classification error. In addition to protein structure networks, our weighted-graphlet measure and DNN classifier can potentially be applied to classification of other weighted networks in computational biology as well as in other domains.

5.1 SI Aug 6, 2019

The power of dynamic social networks to predict individuals' mental health

Shikang Liu, David Hachen, Omar Lizardo et al.

Precision medicine has received attention both in and outside the clinic. We focus on the latter, by exploiting the relationship between individuals' social interactions and their mental health to develop a predictive model of one's likelihood to be depressed or anxious from rich dynamic social network data. To our knowledge, we are the first to do this. Existing studies differ from our work in at least one aspect: they do not model social interaction data as a network; they do so but analyze static network data; they examine "correlation" between social networks and health but without developing a predictive model; or they study other individual traits but not mental health. In a systematic and comprehensive evaluation, we show that our predictive model that uses dynamic social network data is superior to its static network as well as non-network equivalents when run on the same data.

4.3 SI Jun 11, 2019

Heterogeneous network approach to predict individuals' mental health

Shikang Liu, Fatemeh Vahedian, David Hachen et al.

Depression and anxiety are critical public health issues affecting millions of people around the world. To identify individuals who are vulnerable to depression and anxiety, predictive models have been built that typically utilize data from one source. Unlike these traditional models, in this study, we leverage a rich heterogeneous data set from the University of Notre Dame's NetHealth study that collected individuals' (student participants') social interaction data via smartphones, health-related behavioral data via wearables (Fitbit), and trait data from surveys. To integrate the different types of information, we model the NetHealth data as a heterogeneous information network (HIN). Then, we redefine the problem of predicting individuals' mental health conditions (depression or anxiety) in a novel manner, as applying to our HIN a popular paradigm of a recommender system (RS), which is typically used to predict the preference that a person would give to an item (e.g., a movie or book). In our case, the items are the individuals' different mental health states. We evaluate four state-of-the-art RS approaches. Also, we model the prediction of individuals' mental health as another problem type, that of node classification (NC) in our HIN, evaluating in the process four node features under logistic regression as a proof-of-concept classifier. We find that our RS and NC network methods produce more accurate predictions than a logistic regression model using the same NetHealth data in the traditional non-network fashion as well as a random approach. Also, we find that the best of the considered RS approaches outperforms all considered NC approaches. This is the first study to integrate smartphone, wearable sensor, and survey data in an HIN manner and use RS or NC on the HIN to predict individuals' mental health conditions.

0.8 LG Aug 24, 2018

GoT-WAVE: Temporal network alignment using graphlet-orbit transitions

David Aparício, Pedro Ribeiro, Tijana Milenković et al.

Global pairwise network alignment (GPNA) aims to find a one-to-one node mapping between two networks that identifies conserved network regions. GPNA algorithms optimize node conservation (NC) and edge conservation (EC). NC quantifies topological similarity between nodes. Graphlet-based degree vectors (GDVs) are a state-of-the-art topological NC measure. Dynamic GDVs (DGDVs) were used as a dynamic NC measure within the first-ever algorithms for GPNA of temporal networks: DynaMAGNA++ and DynaWAVE. The latter is superior for larger networks. We recently developed a different graphlet-based measure of temporal node similarity, graphlet-orbit transitions (GoTs). Here, we use GoTs instead of DGDVs as a new dynamic NC measure within DynaWAVE, resulting in a new approach, GoT-WAVE. On synthetic networks, GoT-WAVE improves DynaWAVE's accuracy by 25% and speed by 64%. On real networks, when optimizing only dynamic NC, each method is superior ~50% of the time. While DynaWAVE benefits more from also optimizing dynamic EC, only GoT-WAVE can support directed edges. Hence, GoT-WAVE is a promising new temporal GPNA algorithm, which efficiently optimizes dynamic NC. Future work on better incorporating dynamic EC may yield further improvements.
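
To make the idea of a one-to-one node mapping and edge conservation concrete, the following is a minimal sketch of a simple static edge-conservation score: the fraction of edges in the first network whose endpoints map onto an edge in the second. This is an illustrative assumption, not the dynamic EC measure optimized by DynaWAVE or GoT-WAVE, and the function name `edge_conservation` is hypothetical.

```python
def edge_conservation(edges_a, edges_b, mapping):
    """Fraction of edges in network A that the mapping sends onto edges of network B.

    edges_a, edges_b: iterables of undirected edges given as (u, v) pairs.
    mapping: dict sending each node of A to a distinct node of B.
    Illustrative static score only (an assumption, not the papers' dynamic EC).
    """
    edges_a = list(edges_a)
    if not edges_a:
        return 0.0
    # Store B's edges as frozensets so that (u, v) and (v, u) are the same edge.
    edges_b_set = {frozenset(e) for e in edges_b}
    conserved = sum(
        1 for u, v in edges_a
        if frozenset((mapping[u], mapping[v])) in edges_b_set
    )
    return conserved / len(edges_a)


# Toy example: a 3-node path aligned to a network containing only one matching edge.
score = edge_conservation(
    [("a", "b"), ("b", "c")],
    [(1, 2), (1, 3)],
    {"a": 1, "b": 2, "c": 3},
)
print(score)  # 0.5
```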

5.9 MN Apr 12, 2018

Network-based protein structural classification

Khalique Newaz, Mahboobeh Ghalehnovi, Arash Rahnama et al.

Experimental determination of protein function is resource-consuming. As an alternative, computational prediction of protein function has received attention. In this context, protein structural classification (PSC) can help, by allowing for determining structural classes of currently unclassified proteins based on their features, and then relying on the fact that proteins with similar structures have similar functions. Existing PSC approaches rely on sequence-based or direct 3-dimensional (3D) structure-based protein features. In contrast, we first model 3D structures of proteins as protein structure networks (PSNs). Then, we use network-based features for PSC. We propose the use of graphlets, state-of-the-art features in many research areas of network science, in the task of PSC. Moreover, because graphlets can deal only with unweighted PSNs, and because accounting for edge weights when constructing PSNs could improve PSC accuracy, we also propose a deep learning framework that automatically learns network features from weighted PSNs. When evaluated on a large set of ~9,400 CATH and ~12,800 SCOP protein domains (spanning 36 PSN sets), our proposed approaches are superior to existing PSC approaches in terms of accuracy, with comparable running time.
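
As a rough illustration of the pipeline described above (3D structure, then network, then graphlet-style features), here is a minimal sketch. It assumes that residues are given as 3D coordinates and that two residues are linked when they lie within a distance cutoff; the 6.0 cutoff, the function names `build_psn` and `three_node_graphlet_counts`, and the restriction to 3-node graphlets are all illustrative assumptions rather than details taken from the paper.

```python
import math
from itertools import combinations


def build_psn(coords, cutoff=6.0):
    """Build an unweighted network: nodes are residues, edges join spatially close residues.

    coords: list of (x, y, z) tuples, one per residue.
    cutoff: distance threshold (assumed value, not taken from the paper).
    """
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) < cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def three_node_graphlet_counts(adj):
    """Count the two connected 3-node graphlets: induced 2-edge paths and triangles."""
    triangles = 0
    paths = 0
    for center, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                triangles += 1  # each triangle is seen once from each of its 3 nodes
            else:
                paths += 1      # 'center' is the middle node of an induced path
    return {"path": paths, "triangle": triangles // 3}


# Toy example with four residues.
coords = [(0, 0, 0), (3, 0, 0), (0, 3, 0), (8, 0, 0)]
psn = build_psn(coords)
print(three_node_graphlet_counts(psn))  # {'path': 2, 'triangle': 1}
```

Counts like these could serve as a small feature vector for an off-the-shelf classifier; the graphlet features referred to in the abstract cover larger graphlets than this sketch does.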