Cuneyt Gurcan Akcora

LG
h-index21
15papers
248citations
Novelty45%
AI Score50

15 Papers

LGNov 24, 2022Code
Reduction Algorithms for Persistence Diagrams of Networks: CoralTDA and PrunIT

Cuneyt Gurcan Akcora, Murat Kantarcioglu, Yulia R. Gel et al.

Topological data analysis (TDA) delivers invaluable and complementary information on the intrinsic properties of data inaccessible to conventional methods. However, high computational costs remain the primary roadblock hindering the successful application of TDA in real-world studies, particularly with machine learning on large complex networks. Indeed, most modern networks such as citation, blockchain, and online social networks often have hundreds of thousands of vertices, making the application of existing TDA methods infeasible. We develop two new, remarkably simple but effective algorithms to compute the exact persistence diagrams of large graphs to address this major TDA limitation. First, we prove that $(k+1)$-core of a graph $\mathcal{G}$ suffices to compute its $k^{th}$ persistence diagram, $PD_k(\mathcal{G})$. Second, we introduce a pruning algorithm for graphs to compute their persistence diagrams by removing the dominated vertices. Our experiments on large networks show that our novel approach can achieve computational gains up to 95%. The developed framework provides the first bridge between the graph theory and TDA, with applications in machine learning of large complex networks. Our implementation is available at https://github.com/cakcora/PersistentHomologyWithCoralPrunit

68.7SIApr 29
MoltGraph: A Longitudinal Temporal Graph Dataset of Moltbook for Coordinated-Agent Detection

Kunal Mukherjee, Cuneyt Gurcan Akcora, Murat Kantarcioglu

Agent-native social platforms such as Moltbook are rapidly emerging, yet they inherit and amplify classical influence and abuse attacks, where coordinated agents strategically comment and upvote to manipulate visibility and propagate narratives across communities. However, rigorous measurement and learning-based monitoring remain constrained by the absence of longitudinal, graph-native datasets for agentic social networks that jointly capture heterogeneous interactions, temporal drift, and visibility signals needed to connect coordination behavior to downstream exposure. We introduce MoltGraph as a realistic longitudinal agentic social-network graph dataset for studying how agents behave, coordinate, and evolve in the wild, enabling reproducible measurement on emerging multi-agent social ecosystems. Using MoltGraph, we provide the first graph-centric characterization of Moltbook as a dynamic network: (i) heavy-tailed connectivity with power-law exponents in the range alpha in [1.86, 2.72], (ii) accelerating hub formation and attention centralization where the top 1% agents account for 29.00% of engagements, (iii) bursty, short-lived coordination episodes, 98.33% last under 24 hours, and (iv) measurable exposure effects across submolts. In matched analyses, posts receiving coordinated engagement exhibit 506.35% higher early interaction rates (within H=5 days) and 242.63% higher downstream exposure in feeds than non-coordinated controls.

LGMar 24, 2023
Data Depth and Core-based Trend Detection on Blockchain Transaction Networks

Jason Zhu, Arijit Khan, Cuneyt Gurcan Akcora

Blockchains are significantly easing trade finance, with billions of dollars worth of assets being transacted daily. However, analyzing these networks remains challenging due to the sheer volume and complexity of the data. We introduce a method named InnerCore that detects market manipulators within blockchain-based networks and offers a sentiment indicator for these networks. This is achieved through data depth-based core decomposition and centered motif discovery, ensuring scalability. InnerCore is a computationally efficient, unsupervised approach suitable for analyzing large temporal graphs. We demonstrate its effectiveness by analyzing and detecting three recent real-world incidents from our datasets: the catastrophic collapse of LunaTerra, the Proof-of-Stake switch of Ethereum, and the temporary peg loss of USDC - while also verifying our results against external ground truth. Our experiments show that InnerCore can match the qualified analysis accurately without human involvement, automating blockchain analysis in a scalable manner, while being more effective and efficient than baselines and state-of-the-art attributed change detection approach in dynamic graphs.

30.8LGMay 7
Adversarial Graph Neural Network Benchmarks: Towards Practical and Fair Evaluation

Tran Gia Bao Ngo, Zulfikar Alom, Federico Errica et al.

Adversarial learning and the robustness of Graph Neural Networks (GNNs) are topics of widespread interest in the machine learning community, as documented by the number of adversarial attacks and defenses designed for these purposes. While a rigorous evaluation of these adversarial methods is necessary to understand the robustness of GNNs in real-world applications, we posit that many works in the literature do not share the same experimental settings, leading to ambiguous and potentially contradictory scientific conclusions. In this benchmark, we demonstrate the importance of adopting fair, robust, and standardized evaluation protocols in adversarial GNN research. We perform a comprehensive re-evaluation of seven widely used attacks and eight recent defenses under both poisoning and evasion scenarios, across six popular graph datasets. Our study spans over 453,000 experiments conducted within a unified framework. We observe substantial differences in adversarial attack performance when evaluated under a fair and robust procedure. Our findings reveal that previously overlooked factors, such as target node selection and the training process of the attacked model, have a profound impact on attack effectiveness, to the extent of completely distorting performance insights. These results underscore the urgent need for standardized evaluations in adversarial graph machine learning.

LGJan 30
Optimal Transport-Guided Adversarial Attacks on Graph Neural Network-Based Bot Detection

Kunal Mukherjee, Zulfikar Alom, Tran Gia Bao Ngo et al.

The rise of bot accounts on social media poses significant risks to public discourse. To address this threat, modern bot detectors increasingly rely on Graph Neural Networks (GNNs). However, the effectiveness of these GNN-based detectors in real-world settings remains poorly understood. In practice, attackers continuously adapt their strategies as well as must operate under domain-specific and temporal constraints, which can fundamentally limit the applicability of existing attack methods. As a result, there is a critical need for robust GNN-based bot detection methods under realistic, constraint-aware attack scenarios. To address this gap, we introduce BOCLOAK to systematically evaluate the robustness of GNN-based social bot detection via both edge editing and node injection adversarial attacks under realistic constraints. BOCLOAK constructs a probability measure over spatio-temporal neighbor features and learns an optimal transport geometry that separates human and bot behaviors. It then decodes transport plans into sparse, plausible edge edits that evade detection while obeying real-world constraints. We evaluate BOCLOAK across three social bot datasets, five state-of-the-art bot detectors, three adversarial defenses, and compare it against four leading graph adversarial attack baselines. BOCLOAK achieves up to 80.13% higher attack success rates while using 99.80% less GPU memory under realistic real-world constraints. Most importantly, BOCLOAK shows that optimal transport provides a lightweight, principled framework for bridging the gap between adversarial attacks and real-world bot detection.

LGFeb 5
ATEX-CF: Attack-Informed Counterfactual Explanations for Graph Neural Networks

Yu Zhang, Sean Bin Yang, Arijit Khan et al.

Counterfactual explanations offer an intuitive way to interpret graph neural networks (GNNs) by identifying minimal changes that alter a model's prediction, thereby answering "what must differ for a different outcome?". In this work, we propose a novel framework, ATEX-CF that unifies adversarial attack techniques with counterfactual explanation generation-a connection made feasible by their shared goal of flipping a node's prediction, yet differing in perturbation strategy: adversarial attacks often rely on edge additions, while counterfactual methods typically use deletions. Unlike traditional approaches that treat explanation and attack separately, our method efficiently integrates both edge additions and deletions, grounded in theory, leveraging adversarial insights to explore impactful counterfactuals. In addition, by jointly optimizing fidelity, sparsity, and plausibility under a constrained perturbation budget, our method produces instance-level explanations that are both informative and realistic. Experiments on synthetic and real-world node classification benchmarks demonstrate that ATEX-CF generates faithful, concise, and plausible explanations, highlighting the effectiveness of integrating adversarial insights into counterfactual reasoning for GNNs.

CRApr 28, 2024
Machine Learning for Blockchain Data Analysis: Progress and Opportunities

Poupak Azad, Cuneyt Gurcan Akcora, Arijit Khan

Blockchain technology has rapidly emerged to mainstream attention, while its publicly accessible, heterogeneous, massive-volume, and temporal data are reminiscent of the complex dynamics encountered during the last decade of big data. Unlike any prior data source, blockchain datasets encompass multiple layers of interactions across real-world entities, e.g., human users, autonomous programs, and smart contracts. Furthermore, blockchain's integration with cryptocurrencies has introduced financial aspects of unprecedented scale and complexity such as decentralized finance, stablecoins, non-fungible tokens, and central bank digital currencies. These unique characteristics present both opportunities and challenges for machine learning on blockchain data. On one hand, we examine the state-of-the-art solutions, applications, and future directions associated with leveraging machine learning for blockchain data analysis critical for the improvement of blockchain technology such as e-crime detection and trends prediction. On the other hand, we shed light on the pivotal role of blockchain by providing vast datasets and tools that can catalyze the growth of the evolving machine learning ecosystem. This paper serves as a comprehensive resource for researchers, practitioners, and policymakers, offering a roadmap for navigating this dynamic and transformative field.

LGJan 8, 2024
Explaining the Power of Topological Data Analysis in Graph Machine Learning

Funmilola Mary Taiwo, Umar Islambekov, Cuneyt Gurcan Akcora

Topological Data Analysis (TDA) has been praised by researchers for its ability to capture intricate shapes and structures within data. TDA is considered robust in handling noisy and high-dimensional datasets, and its interpretability is believed to promote an intuitive understanding of model behavior. However, claims regarding the power and usefulness of TDA have only been partially tested in application domains where TDA-based models are compared to other graph machine learning approaches, such as graph neural networks. We meticulously test claims on TDA through a comprehensive set of experiments and validate their merits. Our results affirm TDA's robustness against outliers and its interpretability, aligning with proponents' arguments. However, we find that TDA does not significantly enhance the predictive power of existing methods in our specific experiments, while incurring significant computational costs. We investigate phenomena related to graph characteristics, such as small diameters and high clustering coefficients, to mitigate the computational expenses of TDA computations. Our results offer valuable perspectives on integrating TDA into graph machine learning tasks.

CROct 16, 2024
SoK: Prompt Hacking of Large Language Models

Baha Rababah, Shang, Wu et al.

The safety and robustness of large language models (LLMs) based applications remain critical challenges in artificial intelligence. Among the key threats to these applications are prompt hacking attacks, which can significantly undermine the security and reliability of LLM-based systems. In this work, we offer a comprehensive and systematic overview of three distinct types of prompt hacking: jailbreaking, leaking, and injection, addressing the nuances that differentiate them despite their overlapping characteristics. To enhance the evaluation of LLM-based applications, we propose a novel framework that categorizes LLM responses into five distinct classes, moving beyond the traditional binary classification. This approach provides more granular insights into the AI's behavior, improving diagnostic precision and enabling more targeted enhancements to the system's safety and robustness.

LGJun 14, 2024
MiNT: Multi-Network Training for Transfer Learning on Temporal Graphs

Kiarash Shamsi, Tran Gia Bao Ngo, Razieh Shirzadkhani et al.

Temporal Graph Learning (TGL) has become a robust framework for discovering patterns in dynamic networks and predicting future interactions. While existing research has largely concentrated on learning from individual networks, this study explores the potential of learning from multiple temporal networks and its ability to transfer to unobserved networks. To achieve this, we introduce Temporal Multi-network Training MiNT, a novel pre-training approach that learns from multiple temporal networks. With a novel collection of 84 temporal transaction networks, we pre-train TGL models on up to 64 networks and assess their transferability to 20 unseen networks. Remarkably, MiNT achieves state-of-the-art results in zero-shot inference, surpassing models individually trained on each network. Our findings further demonstrate that increasing the number of pre-training networks significantly improves transfer performance. This work lays the groundwork for developing Temporal Graph Foundation Models, highlighting the significant potential of multi-network pre-training in TGL.

CRMay 18, 2023
Chainlet Orbits: Topological Address Embedding for the Bitcoin Blockchain

Poupak Azad, Baris Coskunuzer, Murat Kantarcioglu et al.

The rise of cryptocurrencies like Bitcoin, which enable transactions with a degree of pseudonymity, has led to a surge in various illicit activities, including ransomware payments and transactions on darknet markets. These illegal activities often utilize Bitcoin as the preferred payment method. However, current tools for detecting illicit behavior either rely on a few heuristics and laborious data collection processes or employ computationally inefficient graph neural network (GNN) models that are challenging to interpret. To overcome the computational and interpretability limitations of existing techniques, we introduce an effective solution called Chainlet Orbits. This approach embeds Bitcoin addresses by leveraging their topological characteristics in transactions. By employing our innovative address embedding, we investigate e-crime in Bitcoin networks by focusing on distinctive substructures that arise from illicit behavior. The results of our node classification experiments demonstrate superior performance compared to state-of-the-art methods, including both topological and GNN-based approaches. Moreover, our approach enables the use of interpretable and explainable machine learning models in as little as 15 minutes for most days on the Bitcoin transaction network.

LGApr 10, 2021
Smart Vectorizations for Single and Multiparameter Persistence

Baris Coskunuzer, CUneyt Gurcan Akcora, Ignacio Segovia Dominguez et al.

The machinery of topological data analysis becomes increasingly popular in a broad range of machine learning tasks, ranging from anomaly detection and manifold learning to graph classification. Persistent homology is one of the key approaches here, allowing us to systematically assess the evolution of various hidden patterns in the data as we vary a scale parameter. The extracted patterns, or homological features, along with information on how long such features persist throughout the considered filtration of a scale parameter, convey a critical insight into salient data characteristics and data organization. In this work, we introduce two new and easily interpretable topological summaries for single and multi-parameter persistence, namely, saw functions and multi-persistence grid functions, respectively. Compared to the existing topological summaries which tend to assess the numbers of topological features and/or their lifespans at a given filtration step, our proposed saw and multi-persistence grid functions allow us to explicitly account for essential complementary information such as the numbers of births and deaths at each filtration step. These new topological summaries can be regarded as the complexity measures of the evolving subspaces determined by the filtration and are of particular utility for applications of persistent homology on graphs. We derive theoretical guarantees on the stability of the new saw and multi-persistence grid functions and illustrate their applicability for graph classification tasks.

CRMar 15, 2021
Blockchain Networks: Data Structures of Bitcoin, Monero, Zcash, Ethereum, Ripple and Iota

Cuneyt Gurcan Akcora, Murat Kantarcioglu, Yulia R. Gel

Blockchain is an emerging technology that has enabled many applications, from cryptocurrencies to digital asset management and supply chains. Due to this surge of popularity, analyzing the data stored on blockchains poses a new critical challenge in data science. To assist data scientists in various analytic tasks on a blockchain, in this tutorial, we provide a systematic and comprehensive overview of the fundamental elements of blockchain network models. We discuss how we can abstract blockchain data as various types of networks and further use such associated network abstractions to reap important insights on blockchains' structure, organization, and functionality.

LGAug 18, 2019
ChainNet: Learning on Blockchain Graphs with Topological Features

Nazmiye Ceren Abay, Cuneyt Gurcan Akcora, Yulia R. Gel et al.

With emergence of blockchain technologies and the associated cryptocurrencies, such as Bitcoin, understanding network dynamics behind Blockchain graphs has become a rapidly evolving research direction. Unlike other financial networks, such as stock and currency trading, blockchain based cryptocurrencies have the entire transaction graph accessible to the public (i.e., all transactions can be downloaded and analyzed). A natural question is then to ask whether the dynamics of the transaction graph impacts the price of the underlying cryptocurrency. We show that standard graph features such as degree distribution of the transaction graph may not be sufficient to capture network dynamics and its potential impact on fluctuations of Bitcoin price. In contrast, the new graph associated topological features computed using the tools of persistent homology, are found to exhibit a high utility for predicting Bitcoin price dynamics. %explain higher order interactions among the nodes in Blockchain graphs and can be used to build much more accurate price prediction models. Using the proposed persistent homology-based techniques, we offer a new elegant, easily extendable and computationally light approach for graph representation learning on Blockchain.

CRJun 19, 2019
BitcoinHeist: Topological Data Analysis for Ransomware Detection on the Bitcoin Blockchain

Cuneyt Gurcan Akcora, Yitao Li, Yulia R. Gel et al.

Proliferation of cryptocurrencies (e.g., Bitcoin) that allow pseudo-anonymous transactions, has made it easier for ransomware developers to demand ransom by encrypting sensitive user data. The recently revealed strikes of ransomware attacks have already resulted in significant economic losses and societal harm across different sectors, ranging from local governments to health care. Most modern ransomware use Bitcoin for payments. However, although Bitcoin transactions are permanently recorded and publicly available, current approaches for detecting ransomware depend only on a couple of heuristics and/or tedious information gathering steps (e.g., running ransomware to collect ransomware related Bitcoin addresses). To our knowledge, none of the previous approaches have employed advanced data analytics techniques to automatically detect ransomware related transactions and malicious Bitcoin addresses. By capitalizing on the recent advances in topological data analysis, we propose an efficient and tractable data analytics framework to automatically detect new malicious addresses in a ransomware family, given only a limited records of previous transactions. Furthermore, our proposed techniques exhibit high utility to detect the emergence of new ransomware families, that is, ransomware with no previous records of transactions. Using the existing known ransomware data sets, we show that our proposed methodology provides significant improvements in precision and recall for ransomware transaction detection, compared to existing heuristic based approaches, and can be utilized to automate ransomware detection.