CRJul 5, 2025Code
False Alarms, Real Damage: Adversarial Attacks Using LLM-based Models on Text-based Cyber Threat Intelligence SystemsSamaneh Shafee, Alysson Bessani, Pedro M. Ferreira
Cyber Threat Intelligence (CTI) has emerged as a vital complementary approach that operates in the early phases of the cyber threat lifecycle. CTI involves collecting, processing, and analyzing threat data to provide a more accurate and rapid understanding of cyber threats. Due to the large volume of data, automation through Machine Learning (ML) and Natural Language Processing (NLP) models is essential for effective CTI extraction. These automated systems leverage Open Source Intelligence (OSINT) from sources like social networks, forums, and blogs to identify Indicators of Compromise (IoCs). Although prior research has focused on adversarial attacks on specific ML models, this study expands the scope by investigating vulnerabilities within various components of the entire CTI pipeline and their susceptibility to adversarial attacks. These vulnerabilities arise because they ingest textual inputs from various open sources, including real and potentially fake content. We analyse three types of attacks against CTI pipelines, including evasion, flooding, and poisoning, and assess their impact on the system's information selection capabilities. Specifically, on fake text generation, the work demonstrates how adversarial text generation techniques can create fake cybersecurity and cybersecurity-like text that misleads classifiers, degrades performance, and disrupts system functionality. The focus is primarily on the evasion attack, as it precedes and enables flooding and poisoning attacks within the CTI pipeline.
CRJan 26, 2024Code
Evaluation of LLM Chatbots for OSINT-based Cyber Threat AwarenessSamaneh Shafee, Alysson Bessani, Pedro M. Ferreira
Knowledge sharing about emerging threats is crucial in the rapidly advancing field of cybersecurity and forms the foundation of Cyber Threat Intelligence (CTI). In this context, Large Language Models are becoming increasingly significant in the field of cybersecurity, presenting a wide range of opportunities. This study surveys the performance of ChatGPT, GPT4all, Dolly, Stanford Alpaca, Alpaca-LoRA, Falcon, and Vicuna chatbots in binary classification and Named Entity Recognition (NER) tasks performed using Open Source INTelligence (OSINT). We utilize well-established data collected in previous research from Twitter to assess the competitiveness of these chatbots when compared to specialized models trained for those tasks. In binary classification experiments, Chatbot GPT-4 as a commercial model achieved an acceptable F1 score of 0.94, and the open-source GPT4all model achieved an F1 score of 0.90. However, concerning cybersecurity entity recognition, all evaluated chatbots have limitations and are less effective. This study demonstrates the capability of chatbots for OSINT binary classification and shows that they require further improvement in NER to effectively replace specially trained models. Our results shed light on the limitations of the LLM chatbots when compared to specialized models, and can help researchers improve chatbots technology with the objective to reduce the required effort to integrate machine learning in OSINT-based CTI tools.
CRApr 3, 2019Code
Processing Tweets for Cybersecurity Threat AwarenessFernando Alves, Aurélien Bettini, Pedro M. Ferreira et al.
Receiving timely and relevant security information is crucial for maintaining a high-security level on an IT infrastructure. This information can be extracted from Open Source Intelligence published daily by users, security organisations, and researchers. In particular, Twitter has become an information hub for obtaining cutting-edge information about many subjects, including cybersecurity. This work proposes SYNAPSE, a Twitter-based streaming threat monitor that generates a continuously updated summary of the threat landscape related to a monitored infrastructure. Its tweet-processing pipeline is composed of filtering, feature extraction, binary classification, an innovative clustering strategy, and generation of Indicators of Compromise (IoCs). A quantitative evaluation considering all tweets from 80 accounts over more than 8 months (over 195.000 tweets), shows that our approach timely and successfully finds the majority of security-related tweets concerning an example IT infrastructure (true positive rate above 90%), incorrectly selects a small number of tweets as relevant (false positive rate under 10%), and summarises the results to very few IoCs per day. A qualitative evaluation of the IoCs generated by SYNAPSE demonstrates their relevance (based on the CVSS score and the availability of patches or exploits), and timeliness (based on threat disclosure dates from NVD).
LGApr 1, 2019Code
Cyberthreat Detection from Twitter using Deep Neural NetworksNuno Dionísio, Fernando Alves, Pedro M. Ferreira et al.
To be prepared against cyberattacks, most organizations resort to security information and event management systems to monitor their infrastructures. These systems depend on the timeliness and relevance of the latest updates, patches and threats provided by cyberthreat intelligence feeds. Open source intelligence platforms, namely social media networks such as Twitter, are capable of aggregating a vast amount of cybersecurity-related sources. To process such information streams, we require scalable and efficient tools capable of identifying and summarizing relevant information for specified assets. This paper presents the processing pipeline of a novel tool that uses deep neural networks to process cybersecurity information received from Twitter. A convolutional neural network identifies tweets containing security-related information relevant to assets in an IT infrastructure. Then, a bidirectional long short-term memory network extracts named entities from these tweets to form a security alert or to fill an indicator of compromise. The proposed pipeline achieves an average 94% true positive rate and 91% true negative rate for the classification task and an average F1-score of 92% for the named entity recognition task, across three case study infrastructures.
AIApr 17, 2024
A Survey on Semantic Modeling for Building Energy ManagementMiracle Aniakor, Vinicius V. Cogo, Pedro M. Ferreira
Buildings account for a substantial portion of global energy consumption. Reducing buildings' energy usage primarily involves obtaining data from building systems and environment, which are instrumental in assessing and optimizing the building's performance. However, as devices from various manufacturers represent their data in unique ways, this disparity introduces challenges for semantic interoperability and creates obstacles in developing scalable building applications. This survey explores the leading semantic modeling techniques deployed for energy management in buildings. Furthermore, it aims to offer tangible use cases for applying semantic models, shedding light on the pivotal concepts and limitations intrinsic to each model. Our findings will assist researchers in discerning the appropriate circumstances and methodologies for employing these models in various use cases.
AIMay 3, 2018
Dimensional emotion recognition using visual and textual cuesPedro M. Ferreira, Diogo Pernes, Kelwin Fernandes et al.
This paper addresses the problem of automatic emotion recognition in the scope of the One-Minute Gradual-Emotional Behavior challenge (OMG-Emotion challenge). The underlying objective of the challenge is the automatic estimation of emotion expressions in the two-dimensional emotion representation space (i.e., arousal and valence). The adopted methodology is a weighted ensemble of several models from both video and text modalities. For video-based recognition, two different types of visual cues (i.e., face and facial landmarks) were considered to feed a multi-input deep neural network. Regarding the text modality, a sequential model based on a simple recurrent architecture was implemented. In addition, we also introduce a model based on high-level features in order to embed domain knowledge in the learning process. Experimental results on the OMG-Emotion validation set demonstrate the effectiveness of the implemented ensemble model as it clearly outperforms the current baseline methods.