CR Sep 14, 2021
What are the attackers doing now? Automating cyber threat intelligence extraction from text on pace with the changing threat landscape: A survey
Md Rayhanur Rahman, Rezvan Mahdavi-Hezaveh, Laurie Williams
Cybersecurity researchers have contributed to the automated extraction of cyberthreat intelligence (CTI) from textual sources, such as threat reports and online articles, where cyberattack strategies, procedures, and tools are described. The goal of this article is to help cybersecurity researchers understand the current techniques used for CTI extraction from text through a survey of relevant studies in the literature. We systematically collect "CTI extraction from text"-related studies from the literature and categorize the CTI extraction purposes. We propose a CTI extraction pipeline abstracted from these studies. We identify the data sources, techniques, and CTI sharing formats utilized in the context of the proposed pipeline. Our work finds ten types of extraction purposes, such as extraction of indicators of compromise, TTPs (tactics, techniques, and procedures of attack), and cybersecurity keywords. We also identify seven types of textual sources for CTI extraction; textual data obtained from hacker forums, threat reports, social media posts, and online news articles have been used by almost 90% of the studies. Natural language processing, along with both supervised and unsupervised machine learning techniques such as named entity recognition, topic modelling, dependency parsing, supervised classification, and clustering, is used for CTI extraction. We observe that these studies face technical challenges in obtaining clean, labelled data, the availability of which would support replication, validation, and further extension of the studies. Because the studies we find focus on extracting CTI from text, we advocate building on the current CTI extraction work to help cybersecurity practitioners with proactive decision making, such as threat prioritization and automated threat modelling, that utilizes knowledge from past cybersecurity incidents.
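To make "extraction of indicators of compromise" concrete, here is a minimal sketch of a pattern-based baseline. It is not a method taken from the survey, which reports techniques such as named entity recognition and supervised classification. The regular expressions, the `extract_iocs` function name, and the sample sentence are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for a few common indicator types (illustrative only).
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
PATTERNS = {
    "ipv4": re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "cve": re.compile(r"\bCVE-\d{4}-\d{4,}\b"),
}


def extract_iocs(text):
    """Return a dict mapping each indicator type to its sorted unique matches."""
    return {
        name: sorted(set(pattern.findall(text)))
        for name, pattern in PATTERNS.items()
    }


# Invented sample sentence; 203.0.113.7 is in a documentation-only IP range.
sample = (
    "The dropper contacted 203.0.113.7 and exploited CVE-2021-40444; "
    "payload MD5 d41d8cd98f00b204e9800998ecf8427e."
)
iocs = extract_iocs(sample)
```

A pattern-based pass like this only covers indicators with a rigid surface form. Extracting TTPs or free-text entities would call for the learned techniques the abstract lists.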
SE Jul 14, 2019
Software Development with Feature Toggles: Practices used by Practitioners
Rezvan Mahdavi-Hezaveh, Jacob Dremann, Laurie Williams
Background: Using feature toggles is a technique that allows developers to turn a feature either on or off with a variable in a conditional statement. Feature toggles are increasingly used by software companies to facilitate continuous integration and continuous delivery. However, using feature toggles inappropriately may cause problems which can have a severe impact, such as code complexity, dead code, and system failure. For example, the erroneous repurposing of an old feature toggle caused Knight Capital Group, an American global financial services firm, to go bankrupt due to the implications of the resultant incorrect system behavior. Aim: The goal of this research project is to aid software practitioners in the use of practices to support software development with feature toggles through an empirical study of feature toggle practice usage by practitioners. Method: We conducted a qualitative analysis of 99 artifacts from the grey literature and 10 peer-reviewed papers about feature toggles. We conducted a survey of practitioners from 38 companies. Results: We identified 17 practices in 4 categories: Management practices, Initialization practices, Implementation practices, and Clean-up practices. We observed that all of the survey respondents use a dedicated tool to create and manage feature toggles in their code. Documenting feature toggles' metadata, setting up the default value for feature toggles, and logging the changes made to feature toggles are also frequently observed practices. Conclusions: The feature toggle development practices discovered and enumerated in this work can help practitioners more effectively use feature toggles. This work can enable future mining of code repositories to automatically identify feature toggle practices.
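The following minimal sketch illustrates what "a variable in a conditional statement" can look like, along with three of the practices named in the abstract: documenting metadata, setting a default value, and logging changes. The `ToggleRegistry` class, the `new_discount` toggle, and the `checkout_total` function are hypothetical names chosen for illustration. They do not come from the paper or from any particular toggle management tool.

```python
import logging

logger = logging.getLogger("feature_toggles")


class ToggleRegistry:
    """Hypothetical in-memory toggle store (illustrative, not a real tool)."""

    def __init__(self):
        self._toggles = {}

    def register(self, name, default=False, owner="", description=""):
        # Record metadata and an explicit default value for the toggle.
        self._toggles[name] = {
            "value": default,
            "default": default,
            "owner": owner,
            "description": description,
        }

    def is_enabled(self, name):
        # Unknown toggles are treated as off rather than raising.
        toggle = self._toggles.get(name)
        return bool(toggle["value"]) if toggle else False

    def set(self, name, value):
        # Log every change so toggle history can be audited later.
        old = self._toggles[name]["value"]
        self._toggles[name]["value"] = value
        logger.info("toggle %s changed from %s to %s", name, old, value)


def checkout_total(prices_in_cents, toggles):
    """Sum prices; apply a 10% discount only when the toggle is on."""
    total = sum(prices_in_cents)
    if toggles.is_enabled("new_discount"):  # the toggle point
        total -= total // 10
    return total


registry = ToggleRegistry()
registry.register(
    "new_discount",
    default=False,
    owner="checkout-team",
    description="10% discount experiment",
)
total_off = checkout_total([10000, 5000], registry)
registry.set("new_discount", True)
total_on = checkout_total([10000, 5000], registry)
```

Once the experiment ends, the conditional and its registration would be deleted. That corresponds to the clean-up category, and it avoids leaving an old toggle in place to be repurposed later.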
SE Jul 13, 2018
Where Are The Gaps? A Systematic Mapping Study of Infrastructure as Code Research
Akond Rahman, Rezvan Mahdavi-Hezaveh, Laurie Williams
Context: Infrastructure as code (IaC) is the practice of automatically configuring system dependencies and provisioning local and remote instances. Practitioners consider IaC a fundamental pillar for implementing DevOps practices, which help them rapidly deliver software and services to end-users. Information technology (IT) organizations, such as GitHub, Mozilla, Facebook, Google, and Netflix, have adopted IaC. A systematic mapping study on existing IaC research can help researchers identify potential research areas related to IaC, for example, the areas of defects and security flaws that may occur in IaC scripts. Objective: The objective of this paper is to help researchers identify research areas related to infrastructure as code (IaC) by conducting a systematic mapping study of IaC-related research. Methodology: We conduct our research study by searching six scholarly databases. We collect a set of 33,887 publications by using seven search strings. By systematically applying inclusion and exclusion criteria, we identify 31 publications related to IaC. We identify topics addressed in these publications by applying qualitative analysis. Results: We identify four topics studied in IaC-related publications: (i) framework/tool for infrastructure as code; (ii) use of infrastructure as code; (iii) empirical study related to infrastructure as code; and (iv) testing in infrastructure as code. According to our analysis, 52% of the studied 31 publications propose a framework or tool to implement the practice of IaC or extend the functionality of an existing IaC tool. Conclusion: As defects and security flaws can have serious consequences for the deployment and development environments in DevOps, we observe the need, along with other topics, for research studies that investigate defects and security flaws in IaC.