Anton Kocheturov

CR
h-index3
6papers
85citations
Novelty48%
AI Score47

6 Papers

30.1CRApr 30Code
Characterizing and Modeling the GitHub Security Advisories Review Pipeline

Claudio Segal, Paulo Segal, Carlos Eduardo Banjar et al.

GitHub Security Advisories (GHSA) have become a central component of open-source vulnerability disclosure and are widely used by developers and security tools. A distinctive feature of GHSA is that only a fraction of advisories are reviewed by GitHub, while the mechanisms associated with this review process remain poorly understood. In this paper, we conduct a large-scale empirical study of the GHSA review processes, analyzing over 288,000 advisories spanning 2019-2025. We characterize which advisories are more likely to be reviewed, quantify review delays, and identify two distinct review-latency regimes: a fast path dominated by GitHub Repository Advisories (GRAs) and a slow path dominated by NVD-first advisories. We further develop a queueing model that accounts for this dichotomy based on the structure of the advisory processing pipeline.

CRAug 3, 2023
Cream Skimming the Underground: Identifying Relevant Information Points from Online Forums

Felipe Moreno-Vera, Mateus Nogueira, Cainã Figueiredo et al.

This paper proposes a machine learning-based approach for detecting the exploitation of vulnerabilities in the wild by monitoring underground hacking forums. The increasing volume of posts discussing exploitation in the wild calls for an automatic approach to process threads and posts that will eventually trigger alarms depending on their content. To illustrate the proposed system, we use the CrimeBB dataset, which contains data scraped from multiple underground forums, and develop a supervised machine learning model that can filter threads citing CVEs and label them as Proof-of-Concept, Weaponization, or Exploitation. Leveraging random forests, we indicate that accuracy, precision and recall above 0.99 are attainable for the classification task. Additionally, we provide insights into the difference in nature between weaponization and exploitation, e.g., interpreting the output of a decision tree, and analyze the profits and other aspects related to the hacking communities. Overall, our work sheds insight into the exploitation of vulnerabilities in the wild and can be used to provide additional ground truth to models such as EPSS and Expected Exploitability.

10.9CRMar 26
An Approach to Generate Attack Graphs with a Case Study on Siemens PCS7 Blueprint for Water Treatment Plants

Lucas Miranda, Carlos Banjar, Daniel Menasche et al.

Assessing the security posture of Industrial Control Systems (ICS) is critical for protecting essential infrastructure. However, the complexity and scale of these environments make it challenging to identify and prioritize potential attack paths. This paper introduces a semi-automated approach for generating attack graphs in ICS environments to visualize and analyze multi-step attack scenarios. Our methodology integrates network topology information with vulnerability data to construct a model of the system. This model is then processed by a stateful traversal algorithm to identify potential exploit chains based on preconditions and consequences. We present a case study applying the proposed framework to the Siemens PCS7 Cybersecurity Blueprint for Water Treatment Plants. The results demonstrate the framework's ability to simulate different attack scenarios, including those originating from known CVEs and potential device misconfigurations. We show how a single point of failure can compromise network segmentation and how patching a critical vulnerability can protect an entire security zone, providing actionable insights for risk mitigation.

SESep 29, 2025
AGNOMIN -- Architecture Agnostic Multi-Label Function Name Prediction

Yonatan Gizachew Achamyeleh, Tongtao Zhang, Joshua Hyunki Kim et al.

Function name prediction is crucial for understanding stripped binaries in software reverse engineering, a key step for \textbf{enabling subsequent vulnerability analysis and patching}. However, existing approaches often struggle with architecture-specific limitations, data scarcity, and diverse naming conventions. We present AGNOMIN, a novel architecture-agnostic approach for multi-label function name prediction in stripped binaries. AGNOMIN builds Feature-Enriched Hierarchical Graphs (FEHGs), combining Control Flow Graphs, Function Call Graphs, and dynamically learned \texttt{PCode} features. A hierarchical graph neural network processes this enriched structure to generate consistent function representations across architectures, vital for \textbf{scalable security assessments}. For function name prediction, AGNOMIN employs a Renée-inspired decoder, enhanced with an attention-based head layer and algorithmic improvements. We evaluate AGNOMIN on a comprehensive dataset of 9,000 ELF executable binaries across three architectures, demonstrating its superior performance compared to state-of-the-art approaches, with improvements of up to 27.17\% in precision and 55.86\% in recall across the testing dataset. Moreover, AGNOMIN generalizes well to unseen architectures, achieving 5.89\% higher recall than the closest baseline. AGNOMIN's practical utility has been validated through security hackathons, where it successfully aided reverse engineers in analyzing and patching vulnerable binaries across different architectures.

LGMar 5, 2021
NF-GNN: Network Flow Graph Neural Networks for Malware Detection and Classification

Julian Busch, Anton Kocheturov, Volker Tresp et al.

Malicious software (malware) poses an increasing threat to the security of communication systems as the number of interconnected mobile devices increases exponentially. While some existing malware detection and classification approaches successfully leverage network traffic data, they treat network flows between pairs of endpoints independently and thus fail to leverage rich communication patterns present in the complete network. Our approach first extracts flow graphs and subsequently classifies them using a novel edge feature-based graph neural network model. We present three variants of our base model, which support malware detection and classification in supervised and unsupervised settings. We evaluate our approach on flow graphs that we extract from a recently published dataset for mobile malware detection that addresses several issues with previously available datasets. Experiments on four different prediction tasks consistently demonstrate the advantages of our approach and show that our graph neural network model can boost detection performance by a significant margin.

LGApr 26, 2018
Extended Vertical Lists for Temporal Pattern Mining from Multivariate Time Series

Anton Kocheturov, Petar Momcilovic, Azra Bihorac et al.

Temporal Pattern Mining (TPM) is the problem of mining predictive complex temporal patterns from multivariate time series in a supervised setting. We develop a new method called the Fast Temporal Pattern Mining with Extended Vertical Lists. This method utilizes an extension of the Apriori property which requires a more complex pattern to appear within records only at places where all of its subpatterns are detected as well. The approach is based on a novel data structure called the Extended Vertical List that tracks positions of the first state of the pattern inside records. Extensive computational results indicate that the new method performs significantly faster than the previous version of the algorithm for TMP. However, the speed-up comes at the expense of memory usage.