IR · CL · Aug 21, 2013

PACE: Pattern Accurate Computationally Efficient Bootstrapping for Timely Discovery of Cyber-Security Concepts

arXiv:1308.4648v3 · 41 citations
Originality: Incremental advance
AI Analysis

The paper addresses cybersecurity professionals' need for timely discovery of security vulnerabilities and exploits, but its contribution appears incremental because it builds on traditional bootstrapping methods.

The paper tackles the problem of delayed classification of cyber-security information from online sources by proposing PACE, a semi-supervised learning algorithm that enhances bootstrapping for entity extraction with a time-memory trade-off, aiming to increase accuracy without costly corpus searches.

Public disclosure of important security information, such as knowledge of vulnerabilities or exploits, often occurs in blogs, tweets, mailing lists, and other online sources months before proper classification into structured databases. In order to facilitate timely discovery of such knowledge, we propose a novel semi-supervised learning algorithm, PACE, for identifying and classifying relevant entities in text sources. The main contribution of this paper is an enhancement of the traditional bootstrapping method for entity extraction by employing a time-memory trade-off that simultaneously circumvents a costly corpus search while strengthening pattern nomination, which should increase accuracy. An implementation in the cyber-security domain is discussed as well as challenges to Natural Language Processing imposed by the security domain.
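The time-memory trade-off mentioned in the abstract can be illustrated with a generic bootstrapping loop. The sketch below is not the paper's PACE algorithm. It only shows the general idea of indexing the corpus once so that later rounds nominate patterns and entities by lookup instead of rescanning the text. Every name in it (`build_index`, `bootstrap`, `min_support`, `min_precision`), the choice of a pattern as the pair of neighbouring words, and the toy corpus are assumptions made for illustration.

```python
from collections import defaultdict


def build_index(sentences):
    """Scan the corpus once and record which tokens each context surrounds.

    A "pattern" here is simply (previous word, next word); this is an
    illustrative assumption, not the pattern definition used in the paper.
    """
    pattern_to_tokens = defaultdict(set)
    token_to_patterns = defaultdict(set)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            pattern = (tokens[i - 1], tokens[i + 1])
            pattern_to_tokens[pattern].add(tokens[i])
            token_to_patterns[tokens[i]].add(pattern)
    return pattern_to_tokens, token_to_patterns


def bootstrap(sentences, seeds, rounds=2, min_support=2, min_precision=0.5):
    """Grow a seed entity set using only index lookups (no corpus rescans)."""
    pattern_to_tokens, token_to_patterns = build_index(sentences)
    known = set(seeds)
    for _ in range(rounds):
        # Candidate patterns: every context in which a known entity occurs.
        candidates = set()
        for entity in known:
            candidates |= token_to_patterns.get(entity, set())

        # Nominate patterns that match enough known entities (support)
        # and mostly match known entities (precision).
        accepted = []
        for pattern in candidates:
            matched = pattern_to_tokens[pattern]
            support = len(matched & known)
            precision = support / len(matched)
            if support >= min_support and precision >= min_precision:
                accepted.append(pattern)

        # Every token matched by a nominated pattern becomes an entity.
        new_entities = set()
        for pattern in accepted:
            new_entities |= pattern_to_tokens[pattern]
        if new_entities <= known:
            break  # nothing new was found, so stop early
        known |= new_entities
    return known


# Hypothetical toy corpus and seeds, for illustration only.
corpus = [
    "exploit targets Java on Windows",
    "exploit targets Flash on Windows",
    "exploit targets Reader on Linux",
    "patch fixes Java today",
]
found = bootstrap(corpus, {"Java", "Flash"})
print(sorted(found))
```

In this toy run the seeds Java and Flash nominate the context (targets, on), which in turn surfaces Reader. The context (fixes, today) is rejected because only one known entity supports it. The index costs extra memory up front, but each round afterwards is a set of dictionary lookups.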

