CyNER: A Python Library for Cybersecurity Named Entity Recognition
CyNER gives cybersecurity professionals a tool for processing unstructured threat data, though the contribution is incremental because it builds on existing models and ontologies.
The authors tackled the problem of unstructured cybersecurity threat intelligence by developing CyNER, an open-source Python library for cybersecurity named entity recognition, which combines transformer-based models, heuristics, and public NER models to extract entities and indicators of compromise from diverse sources.
Open Cyber threat intelligence (OpenCTI) information is available in an unstructured format from heterogeneous sources on the Internet. We present CyNER, an open-source Python library for cybersecurity named entity recognition (NER). CyNER combines transformer-based models for extracting cybersecurity-related entities, heuristics for extracting different indicators of compromise, and publicly available NER models for generic entity types. We provide models trained on a diverse corpus that users can readily use. Events are described as classes in previous research, MALOnt2.0 (Christian et al., 2021) and MALOnt (Rastogi et al., 2020), and together they allow extraction of a wide range of malware attack details from a threat intelligence corpus. Users can combine predictions from multiple approaches to suit their needs. The library is publicly available.
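To make the idea of combining heuristic indicator extraction with model predictions concrete, the following is a minimal sketch in plain Python. It does not use CyNER's actual API. The names `Span`, `IOC_PATTERNS`, `extract_iocs`, and `merge_spans` are hypothetical, the regular expressions are deliberately simplified, and the precedence rule (earlier sources win on overlap) is an assumption rather than the library's documented behavior.

```python
import re
from typing import NamedTuple


class Span(NamedTuple):
    """A labeled character span; hypothetical type, not part of CyNER."""
    start: int
    end: int
    label: str
    text: str


# Simplified illustrative patterns (assumption); real IOC formats need
# stricter validation, e.g. range checks on IPv4 octets.
IOC_PATTERNS = {
    "IPv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "MD5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "CVE": re.compile(r"\bCVE-\d{4}-\d{4,}\b"),
}


def extract_iocs(text: str) -> list[Span]:
    """Return regex-matched indicator spans, sorted by start offset."""
    spans = []
    for label, pattern in IOC_PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append(Span(match.start(), match.end(), label, match.group()))
    return sorted(spans)


def merge_spans(*sources: list[Span]) -> list[Span]:
    """Union spans from several extractors; on overlap, earlier sources win."""
    kept: list[Span] = []
    for source in sources:
        for span in source:
            # Keep the span only if it is disjoint from every span kept so far.
            if all(span.end <= k.start or span.start >= k.end for k in kept):
                kept.append(span)
    return sorted(kept)


text = "Emotet contacted 192.168.1.10 and exploited CVE-2017-11882."
# Stand-in for the output of a transformer-based entity model (assumption).
model_spans = [Span(0, 6, "Malware", "Emotet")]
merged = merge_spans(model_spans, extract_iocs(text))
for span in merged:
    print(span.label, span.text)
```

Passing the sources in a different order changes which extractor takes precedence on overlapping spans, which is one simple way a user could tune the combination to their needs.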