CR · IR · Feb 10, 2021

Malware Knowledge Graph Generation

arXiv:2102.05583v1 · 5 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the lack of open-source knowledge graphs in the security domain, making threat intelligence extraction more efficient and less dependent on security experts.

The authors address the problem of unstructured cyber threat information by building TINKER, a knowledge graph for threat intelligence. It is built from RDF triples describing entities and relations in 83 threat reports published between 2006 and 2021, and it gives a structured representation for downstream tasks such as predicting missing information and future threats.

Cyber threat and attack intelligence information is available in non-standard formats from heterogeneous sources. Comprehending and utilizing it for threat intelligence extraction requires engaging security experts. Knowledge graphs convert this unstructured information from heterogeneous sources into a structured representation of data and factual knowledge for several downstream tasks, such as predicting missing information and future threat trends. Existing large-scale knowledge graphs mainly focus on general classes of entities and the relationships between them. Open-source knowledge graphs for the security domain do not exist. To fill this gap, we built TINKER, a knowledge graph for threat intelligence (Threat INtelligence KnowlEdge gRaph). TINKER is generated using RDF triples describing entities and relations from tokenized unstructured natural language text from 83 threat reports published between 2006 and 2021. We built TINKER using classes and properties defined by an open-source malware ontology and using hand-annotated RDF triples. We also discuss ongoing research and the challenges faced while creating TINKER.
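To make the idea of RDF triples concrete, the following is a minimal sketch in plain Python of how subject-predicate-object facts from a threat report might be stored and queried. It is an illustration only, not the authors' pipeline: the `ex:` prefix, the entity and relation names (`ex:MalwareA`, `ex:targets`, and so on), the base IRI, and all helper functions are hypothetical and are not taken from TINKER or from the malware ontology it uses.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical triples for illustration; none of these come from TINKER.
TRIPLES = [
    Triple("ex:MalwareA", "ex:targets", "ex:SoftwareS"),
    Triple("ex:MalwareA", "ex:usesVulnerability", "ex:VulnerabilityV"),
    Triple("ex:MalwareB", "ex:targets", "ex:SoftwareS"),
    Triple("ex:CampaignX", "ex:involvesMalware", "ex:MalwareA"),
]


def build_index(triples):
    """Index objects by (subject, predicate) for direct lookups."""
    index = defaultdict(set)
    for t in triples:
        index[(t.subject, t.predicate)].add(t.obj)
    return index


def objects(index, subject, predicate):
    """Return the sorted objects linked to a subject by a predicate."""
    return sorted(index.get((subject, predicate), set()))


def subjects_with(triples, predicate, obj):
    """Return the sorted subjects that point at obj via predicate."""
    return sorted(
        {t.subject for t in triples if t.predicate == predicate and t.obj == obj}
    )


def to_ntriples(triple, base="http://example.org/"):
    """Serialize one triple as an N-Triples line (assumed example base IRI)."""
    def iri(term):
        return f"<{base}{term.split(':', 1)[1]}>"
    return f"{iri(triple.subject)} {iri(triple.predicate)} {iri(triple.obj)} ."


if __name__ == "__main__":
    index = build_index(TRIPLES)
    print(objects(index, "ex:MalwareA", "ex:targets"))
    print(subjects_with(TRIPLES, "ex:targets", "ex:SoftwareS"))
    print(to_ntriples(TRIPLES[0]))
```

Once facts are in this form, questions such as "which malware targets the same software?" become simple lookups, which is the kind of structure that downstream tasks like missing-information prediction build on. A real system would more likely use an RDF library and the ontology's actual class and property IRIs.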

