CR | Oct 31, 2019 | Code
A machine-learning approach to Detect users' suspicious behaviour through the Facebook wall
Aimilia Panagiotou, Bogdan Ghita, Stavros Shiaeles et al.
Facebook represents the current de-facto choice for social media, changing the nature of social relationships. The increasing amount of personal information that runs through this platform publicly exposes user behaviour and social trends, allowing aggregation of data through conventional intelligence collection techniques such as OSINT (Open Source Intelligence). In this paper, we propose a new method to detect and diagnose variations in overall Facebook user psychology through Open Source Intelligence (OSINT) and machine learning techniques. We are aggregating the spectrum of user sentiments and views by using N-Games charts, which exhibit noticeable variations over time, validated through long term collection. We postulate that the proposed approach can be used by security organisations to understand and evaluate the user psychology, then use the information to predict insider threats or prevent insider attacks.
SE | Apr 29
Understanding the Skills Gap between Higher Education Institutions and the Software Engineering Industry
Huy Phan, Ievgeniia Kuzminykh, Bogdan Ghita
In the rapidly evolving field of software engineering, the skills required of graduates entering the job market are constantly changing. Several studies have identified a gap between the skills taught in university curricula and those demanded by the software engineering industry. This chapter investigates the technical skill and expertise gap between higher education institutions (HEIs) and the UK software engineering industry by mapping job descriptions to the skills included in computer science degree programmes. A custom web scraping and text analysis tool, utilising fuzzy matching, was developed to extract and categorise skills from 300 job postings and undergraduate curricula from 30 UK universities. The analysis showed that the curricula place a strong emphasis on Programming Languages (18%) and Database Management (12.83%). In contrast, the industry's most frequently requested skill category is Software Design and Planning, which appears in approximately 88.68% of job descriptions, highlighting its critical importance. General Programming Language and System Structures also show strong demand, present in over 78.30% and 66.04% of postings, respectively. The mapping indicates that areas such as System Structures and Software Domains are significantly underrepresented in curricula, while Database Management and Compiler Design may be overemphasised. These insights can support HEIs in aligning their programmes with industry needs, supporting the preparation of graduates for dynamic careers in software engineering.
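The fuzzy-matching step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' tool: the category names come from the abstract, but the keyword lists, the `match_skills` helper, and the 0.8 threshold are assumptions chosen for illustration, and the similarity measure is the one provided by Python's standard `difflib` module.

```python
from difflib import SequenceMatcher

# Hypothetical taxonomy: category names appear in the abstract,
# the keywords under each category are invented for this sketch.
SKILL_KEYWORDS = {
    "Programming Languages": ["python", "java", "javascript"],
    "Database Management": ["postgresql", "mysql", "sql"],
    "Software Design and Planning": ["design patterns", "agile", "uml"],
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_skills(tokens, threshold=0.8):
    """Return the categories with a keyword that fuzzily matches any token."""
    found = set()
    for token in tokens:
        for category, keywords in SKILL_KEYWORDS.items():
            if any(similarity(token, kw) >= threshold for kw in keywords):
                found.add(category)
    return found


# A misspelled token ("Pyhton") still maps to its category.
posting_tokens = ["Pyhton", "PostgreSQL", "agile", "teamwork"]
matched = match_skills(posting_tokens)
print(sorted(matched))
```

A lower threshold tolerates more spelling variation at the cost of more false matches, so in practice the value would need tuning against labelled postings.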
CR | Apr 1
S-DAPT-2026: A Stage-Aware Synthetic Dataset for Advanced Persistent Threat Detection
Saleem Ishaq Tijjani, Bogdan Ghita, Nathan Clarke et al.
The detection of advanced persistent threats (APTs) remains a crucial challenge due to their stealthy, multistage nature and the limited availability of realistic, labeled datasets for systematic evaluation. Synthetic dataset generation has emerged as a practical approach for modeling APT campaigns; however, existing methods often rely on computationally expensive alert correlation mechanisms that limit scalability. Motivated by these limitations, this paper presents a near-realistic synthetic APT dataset and an efficient alert correlation framework. The proposed approach introduces a machine-learning-based correlation module that employs K-Nearest Neighbors (KNN) clustering with a cosine similarity metric to group semantically related alerts within a temporal context. The dataset emulates multistage APT campaigns across campus and organizational network environments and captures a diverse set of fourteen distinct alert types, exceeding the coverage of commonly used synthetic APT datasets. In addition, explicit APT campaign states and alert-to-stage mappings are defined to enable flexible integration of new alert types and support stage-aware analysis. A comprehensive statistical characterization of the dataset is provided to facilitate reproducibility and support APT stage predictions.
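To make the correlation idea concrete, the following is a minimal sketch of nearest-neighbour grouping with cosine similarity inside a time window. It is not the paper's module: the feature vectors, the 300-second window, and the `nearest_alerts` helper are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_alerts(alerts, query_idx, k=2, window=300.0):
    """Indices of the k alerts most similar to alerts[query_idx],
    restricted to alerts within `window` seconds of its timestamp."""
    t0, v0 = alerts[query_idx]
    candidates = []
    for i, (t, v) in enumerate(alerts):
        if i != query_idx and abs(t - t0) <= window:
            candidates.append((cosine_similarity(v0, v), i))
    # Highest similarity first; ties broken by lower index.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [i for _, i in candidates[:k]]


# Each alert is (timestamp in seconds, hypothetical feature vector).
alerts = [
    (0.0, [1.0, 0.0, 0.2]),
    (60.0, [0.9, 0.1, 0.3]),
    (120.0, [0.0, 1.0, 0.0]),
    (200.0, [1.0, 0.0, 0.1]),
    (900.0, [1.0, 0.0, 0.2]),  # identical features, but outside the window
]
neighbours = nearest_alerts(alerts, 0, k=2)
print(neighbours)
```

The last alert has the same features as the query but is excluded because it falls outside the temporal window, which is the role the abstract assigns to the temporal context.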
CR | Apr 1
Deep Recurrent Hidden Markov Learning Framework for Multi-Stage Advanced Persistent Threat Prediction
Saleem Ishaq Tijjani, Bogdan Ghita, Nathan Clarke et al.
Advanced Persistent Threats (APTs) represent hidden, multi-stage cyberattacks whose long-term persistence and adaptive behavior challenge conventional intrusion detection systems (IDS). Although recent advances in machine learning and probabilistic modeling have improved APT detection performance, most existing approaches remain reactive and alert-centric, providing limited capability for stage-aware prediction and principled inference under uncertainty, particularly when observations are sparse or incomplete. This paper proposes E-HiDNet, a unified hybrid deep probabilistic learning framework that integrates convolutional and recurrent neural networks with a Hidden Markov Model (HMM) to allow accurate prediction of the progression of the APT campaign. The deep learning component extracts hierarchical spatio-temporal representations from correlated alert sequences, while the HMM models latent attack stages and their stochastic transitions, allowing principled inference under uncertainty and partial observability. A modified Viterbi algorithm is introduced to handle incomplete observations, ensuring robust decoding under uncertainty. The framework is evaluated using a synthetically generated yet structurally realistic APT dataset (S-DAPT-2026). Simulation results show that E-HiDNet achieves up to 98.8-100% accuracy in stage prediction and significantly outperforms standalone HMMs when four or more observations are available, even under reduced training data scenarios. These findings highlight that combining deep semantic feature learning with probabilistic state-space modeling enhances predictive APT stage performance and situational awareness for proactive APT defense.
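The abstract does not specify how its modified Viterbi algorithm treats incomplete observations. One common approach, sketched below as an assumption rather than as the paper's method, marks a missing observation with `None` and gives it an emission probability of 1 in every state, so that only the transition model drives that step. The stage names, alert names, and all probabilities are invented for illustration.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state path; a None entry in obs is a missing observation."""

    def emission(state, o):
        # A missing observation carries no evidence, so it weighs all states equally.
        return 1.0 if o is None else emit_p[state][o]

    best = {s: start_p[s] * emission(s, obs[0]) for s in states}
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            arg = max(states, key=lambda r: prev[r] * trans_p[r][s])
            best[s] = prev[arg] * trans_p[arg][s] * emission(s, o)
            pointers[s] = arg
        back.append(pointers)

    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical three-stage campaign model.
states = ["recon", "foothold", "exfil"]
start_p = {"recon": 0.8, "foothold": 0.15, "exfil": 0.05}
trans_p = {
    "recon": {"recon": 0.7, "foothold": 0.25, "exfil": 0.05},
    "foothold": {"recon": 0.05, "foothold": 0.6, "exfil": 0.35},
    "exfil": {"recon": 0.05, "foothold": 0.05, "exfil": 0.9},
}
emit_p = {
    "recon": {"scan": 0.8, "login": 0.15, "upload": 0.05},
    "foothold": {"scan": 0.1, "login": 0.8, "upload": 0.1},
    "exfil": {"scan": 0.05, "login": 0.15, "upload": 0.8},
}

# The second alert is missing from the sequence.
path = viterbi(["scan", None, "login", "upload"], states, start_p, trans_p, emit_p)
print(path)
```

For long sequences the products would underflow, so a practical version would work with log-probabilities; plain probabilities are kept here for readability.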
HC | Oct 14, 2024
Personalised Feedback Framework for Online Education Programmes Using Generative AI
Ievgeniia Kuzminykh, Tareita Nawaz, Shihao Shenzhang et al.
AI tools, particularly large language models, have recently proven their effectiveness within learning management systems and online education programmes. As feedback continues to play a crucial role in learning and assessment in schools, educators must carefully customise the use of AI tools in order to optimally support students in their learning journey. Research studies reflect numerous attempts to improve educational feedback systems, but most have focused on qualitatively benchmarking AI feedback against human-generated feedback. This paper presents an exploration of an alternative feedback framework which extends the capabilities of ChatGPT by integrating embeddings, enabling a more nuanced understanding of educational materials and facilitating topic-targeted feedback for quiz-based assessments. As part of the study, we proposed and developed a proof of concept solution, achieving an efficacy rate of 90% and 100% for open-ended and multiple-choice questions, respectively. The results showed that our framework not only surpasses expectations but also rivals human narratives, highlighting the potential of AI in revolutionising educational feedback mechanisms.
SD | Sep 21, 2021
Audio Interval Retrieval using Convolutional Neural Networks
Ievgeniia Kuzminykh, Dan Shevchuk, Stavros Shiaeles et al.
Modern streaming services are increasingly labeling videos based on their visual or audio content. This typically augments the use of technologies such as AI and ML by allowing natural speech to be used for searching by keywords and video descriptions. Prior research has successfully provided a number of solutions for speech to text in the case of human speech, but this article aims to investigate possible solutions to retrieve sound events based on a natural language query, and to estimate how effective and accurate they are. In this study, we specifically focus on the YamNet, AlexNet, and ResNet-50 pre-trained models to automatically classify audio samples using their respective melspectrograms into a number of predefined classes. The predefined classes can represent sounds associated with actions within a video fragment. Two tests are conducted to evaluate the performance of the models on two separate problems: audio classification and intervals retrieval based on a natural language query. Results show that the benchmarked models are comparable in terms of performance, with YamNet slightly outperforming the other two models. YamNet was able to classify single fixed-size audio samples with 92.7% accuracy and 68.75% precision while its average accuracy on intervals retrieval was 71.62% and precision was 41.95%. The investigated method may be embedded into an automated event marking architecture for streaming services.
CR | Sep 21, 2021
Comparative Analysis of Cryptographic Key Management Systems
Ievgeniia Kuzminykh, Bogdan Ghita, Stavros Shiaeles
Managing cryptographic keys can be a complex task for an enterprise and particularly difficult to scale when an increasing number of users and applications need to be managed. In order to address scalability issues, typical IT infrastructures employ key management systems that are able to handle a large number of encryption keys and associate them with the authorized requests. Given their necessity, recent years have witnessed a variety of key management systems, aligned with the features, quality, price and security needs of specific organisations. While the spectrum of such solutions is welcome and demonstrates the expanding nature of the market, it also makes it time consuming for IT managers to identify the appropriate system for their respective company needs. This paper provides a list of key management tools which include a minimum set of features, such as availability of a secure database for managing keys, an authentication, authorization, and access control model for restricting and managing access to keys, effective logging of actions with keys, and the presence of an API for accessing functions directly from the application code. Five systems were comprehensively compared by evaluating the attributes related to complexity of the implementation, its popularity, linked vulnerabilities and technical performance in terms of response time and network usage. These were Pinterest Knox, HashiCorp Vault, Square Keywhiz, OpenStack Barbican, and CyberArk Conjur. Out of these five, HashiCorp Vault was determined to be the most suitable system for small businesses.
CR | Sep 20, 2021
A Novel Online Incremental Learning Intrusion Prevention System
Christos Constantinides, Stavros Shiaeles, Bogdan Ghita et al.
Attack vectors are continuously evolving in order to evade Intrusion Detection systems. Internet of Things (IoT) environments, while beneficial for the IT ecosystem, suffer from inherent hardware limitations, which restrict their ability to implement comprehensive security measures and increase their exposure to vulnerability attacks. This paper proposes a novel Network Intrusion Prevention System that utilises a Self-Organizing Incremental Neural Network along with a Support Vector Machine. Due to its structure, the proposed system provides a security solution that does not rely on signatures or rules and is capable of mitigating known and unknown attacks in real time with high accuracy. Based on our experimental results with the NSL KDD dataset, the proposed framework can achieve on-line updated incremental learning, making it suitable for efficient and scalable industrial applications.
CR | Sep 8, 2021
Malware Squid: A Novel IoT Malware Traffic Analysis Framework using Convolutional Neural Network and Binary Visualisation
Robert Shire, Stavros Shiaeles, Keltoum Bendiab et al.
Internet of Things devices have seen rapid growth in number and popularity in recent years, with many more ordinary devices gaining network capability and becoming part of the ever-growing IoT network. With this exponential growth and the limitation of resources, it is becoming increasingly hard to protect against security threats such as malware, which evolves faster than the defence mechanisms can handle. Traditional security systems are not able to detect unknown malware as they use signature-based methods. In this paper, we aim to address this issue by introducing a novel IoT malware traffic analysis approach using a neural network and binary visualisation. The prime motivation of the proposed approach is to detect and classify new malware (zero-day malware) faster. The experimental results show that our method can satisfy the accuracy requirement of practical application.
CR | Sep 6, 2021
A Novel Multimodal Biometric Authentication System using Machine Learning and Blockchain
Richard Brown, Gueltoum Bendiab, Stavros Shiaeles et al.
Traditional authentication systems that rely on simple passwords, PIN numbers or tokens have many security issues, like easily guessed passwords, PIN numbers written on the back of cards, etc. Thus, biometric authentication methods that rely on physical and behavioural characteristics have been proposed as an alternative to those systems. In real-world applications, authentication systems that involve a single biometric face many issues, especially lack of accuracy and noisy data, which has prompted the research community to create multibiometric systems that involve a variety of biometrics. Those systems provide better performance and higher accuracy compared to other authentication methods. However, most of them are inconvenient and require complex interactions from the user. Thus, in this paper, we introduce a novel multimodal authentication system that relies on machine learning and blockchain, with the aim of providing a more secure, transparent, and convenient authentication mechanism. The proposed system combines four important biometrics: fingerprint, face, age, and gender. The supervised learning algorithm Decision Tree has been used to combine the results of the biometrics verification process and produce a confidence level related to the user. The initial experimental results show the efficiency and robustness of the proposed multimodal system.
CR | Sep 6, 2021
Detection of Insider Threats using Artificial Intelligence and Visualisation
Vasileios Koutsouvelis, Stavros Shiaeles, Bogdan Ghita et al.
Insider threats are one of the most damaging risk factors for the IT systems and infrastructure of a company or an organization; identification of insider threats has prompted the interest of the world academic research community, with several solutions having been proposed to alleviate their potential impact. For the implementation of the experimental stage described in this study, the Convolutional Neural Network (from now on CNN) algorithm was used and implemented via the Google TensorFlow program, which was trained to identify potential threats from images produced by the available dataset. From the examination of the images that were produced and with the help of Machine Learning, the question of whether the activity of each user is classified as malicious or not for the Information System was answered.
CR | Dec 7, 2020
The Challenges with Internet of Things for Business
Ievgeniia Kuzminykh, Bogdan Ghita, Jose M. Such
Many companies consider IoT a central element for increasing competitiveness. Despite the growing number of cyberattacks on IoT devices and the importance of IoT security, no study has yet primarily focused on the impact of IoT security measures on the security challenges. This paper presents a review of the current state of security of IoT in companies that produce IoT products and have begun a transformation towards the digitalization of their products and the associated production processes. The analysis of challenges in IoT security was conducted based on the review of resources and reports on IoT security, while mapping the relevant solutions/measures for strengthening security to the existing challenges. This mapping assists stakeholders in understanding the IoT security initiatives regarding their business needs and issues. Based on the analysis, we conclude that almost all companies have an understanding of basic security measures such as encryption, but do not understand the threat surface and are not aware of advanced methods of protecting data and devices. The analysis shows that most companies do not have internal experts in IoT security and prefer to outsource security operations to security providers.
CR | Dec 7, 2020
Impact of Network and Host Characteristics on the Keystroke Pattern in Remote Desktop Sessions
Ievgeniia Kuzminykh, Bogdan Ghita, Alexandr Silonosov
Authentication based on keystroke dynamics is a convenient biometric approach: it is easy to use, transparent, and cheap, as it does not require a dedicated sensor. Keystroke authentication, as part of multi-factor authentication, can be used in remote display access to guarantee the security of use of remote connectivity systems during the access control phase or throughout the session. This paper investigates how network conditions and additional host interaction may impact the behavioural pattern of keystrokes when used in a remote desktop application scenario. We focus on the timing of adjacent keys and investigate this impact by calculating the variations of the Euclidean distance between a reference profile and resulting profiles following such impairments. The experimental results indicate that variations of congestion latency, whether produced by adjacent traffic sources or by additional remote desktop interactions, have a substantive impact on the Euclidean distance, which in turn may affect the effectiveness of the biometric authentication algorithm. Results also indicate that data flows within the remote desktop protocol are not prioritized and therefore additional traffic will have a significant impact on the keystroke timings, which renders continuous authentication less effective for remote access and more appropriate for one-time login.
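The distance comparison described in the abstract can be sketched in a few lines. The timing values below are invented, and the acceptance threshold is a hypothetical parameter, not a figure from the paper; the sketch only shows how added latency moves a profile away from its reference.

```python
import math


def euclidean_distance(reference, observed):
    """Euclidean distance between two equal-length timing profiles."""
    if len(reference) != len(observed):
        raise ValueError("profiles must have the same number of timings")
    return math.sqrt(sum((r - o) ** 2 for r, o in zip(reference, observed)))


# Latencies between adjacent keys in milliseconds (invented values).
reference = [120.0, 95.0, 140.0, 110.0]
local = [122.0, 94.0, 143.0, 108.0]        # same user, unimpaired network
congested = [150.0, 135.0, 170.0, 150.0]   # same user, congested remote session

THRESHOLD_MS = 20.0  # hypothetical acceptance threshold

d_local = euclidean_distance(reference, local)
d_congested = euclidean_distance(reference, congested)
print(d_local <= THRESHOLD_MS, d_congested <= THRESHOLD_MS)
```

With these illustrative numbers the unimpaired sample stays within the threshold while the congested one does not, even though both come from the same user, which mirrors the abstract's concern about continuous authentication over remote sessions.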
CR | Mar 12, 2019
Agent-based Vs Agent-less Sandbox for Dynamic Behavioral Analysis
Muhammad Ali, Stavros Shiaeles, Maria Papadaki et al.
Malicious software is detected and classified by either static analysis or dynamic analysis. In static analysis, malware samples are reverse engineered and analyzed so that signatures of malware can be constructed. These techniques can be easily thwarted through polymorphic, metamorphic malware, obfuscation and packing techniques, whereas in dynamic analysis malware samples are executed in a controlled environment using the sandboxing technique, in order to model the behavior of malware. In this paper, we have analyzed Petya, Spyeye, VolatileCedar, PAFISH etc. through Agent-based and Agentless dynamic sandbox systems in order to investigate and benchmark their efficiency in advanced malware detection.
NI | Mar 12, 2019
Detection of LDDoS Attacks Based on TCP Connection Parameters
Michael Siracusano, Stavros Shiaeles, Bogdan Ghita
Low-rate application layer distributed denial of service (LDDoS) attacks are both powerful and stealthy. They force vulnerable webservers to open all available connections to the adversary, denying resources to real users. Mitigation advice focuses on solutions that potentially degrade quality of service for legitimate connections. Furthermore, without accurate detection mechanisms, distributed attacks can bypass these defences. A methodology for detection of LDDoS attacks, based on characteristics of malicious TCP flows, is proposed within this paper. Research is conducted using combinations of two datasets: one generated from a simulated network, the other from the publicly available CIC DoS dataset. Both contain the attacks slowread, slowheaders and slowbody, alongside legitimate web browsing. TCP flow features are extracted from all connections. Experimentation was carried out using six supervised AI algorithms to distinguish attack flows from legitimate flows. Decision trees and k-NN accurately classified up to 99.99% of flows, with exceptionally low false positive and false negative rates, demonstrating the potential of AI in LDDoS detection.