LGFeb 14, 2025
SGS-GNN: A Supervised Graph Sparsification method for Graph Neural NetworksSiddhartha Shankar Das, Naheed Anjum Arafat, Muftiqur Rahman et al.
We propose SGS-GNN, a novel supervised graph sparsifier that learns the sampling probability distribution of edges and samples sparse subgraphs of a user-specified size to reduce the computational costs required by GNNs for inference tasks on large graphs. SGS-GNN employs regularizers in the loss function to enhance homophily in sparse subgraphs, boosting the accuracy of GNNs on heterophilic graphs, where a significant number of the neighbors of a node have dissimilar labels. SGS-GNN also supports conditional updates of the probability distribution learning module based on a prior, which helps narrow the search space for sparse graphs. SGS-GNN requires fewer epochs to obtain high accuracies since it learns the search space of subgraphs more effectively than methods using fixed distributions such as random sampling. Extensive experiments using 33 homophilic and heterophilic graphs demonstrate the following: (i) with only 20% of edges retained in the sparse subgraphs, SGS-GNN improves the F1-scores by a geometric mean of 4% relative to the original graph; on heterophilic graphs, the prediction accuracy is better up to 30%. (ii) SGS-GNN outperforms state-of-the-art methods with improvement in F1-scores of 4-7% in geometric mean with similar sparsities in the sampled subgraphs, and (iii) compared to sparsifiers that employ fixed distributions, SGS-GNN requires about half the number of epochs to converge.
LGFeb 23, 2021
V2W-BERT: A Framework for Effective Hierarchical Multiclass Classification of Software VulnerabilitiesSiddhartha Shankar Das, Edoardo Serra, Mahantesh Halappanavar et al.
Weaknesses in computer systems such as faults, bugs and errors in the architecture, design or implementation of software provide vulnerabilities that can be exploited by attackers to compromise the security of a system. Common Weakness Enumerations (CWE) are a hierarchically designed dictionary of software weaknesses that provide a means to understand software flaws, potential impact of their exploitation, and means to mitigate these flaws. Common Vulnerabilities and Exposures (CVE) are brief low-level descriptions that uniquely identify vulnerabilities in a specific product or protocol. Classifying or mapping of CVEs to CWEs provides a means to understand the impact and mitigate the vulnerabilities. Since manual mapping of CVEs is not a viable option, automated approaches are desirable but challenging. We present a novel Transformer-based learning framework (V2W-BERT) in this paper. By using ideas from natural language processing, link prediction and transfer learning, our method outperforms previous approaches not only for CWE instances with abundant data to train, but also rare CWE classes with little or no data to train. Our approach also shows significant improvements in using historical data to predict links for future instances of CVEs, and therefore, provides a viable approach for practical applications. Using data from MITRE and National Vulnerability Database, we achieve up to 97% prediction accuracy for randomly partitioned data and up to 94% prediction accuracy in temporally partitioned data. We believe that our work will influence the design of better methods and training models, as well as applications to solve increasingly harder problems in cybersecurity.