Nikolaos Polatidis

IR
h-index58
13papers
339citations
Novelty18%
AI Score43

13 Papers

26.5LGApr 15
Evaluating Supervised Machine Learning Models: Principles, Pitfalls, and Metric Selection

Xuanyan Liu, Ignacio Cabrera Martin, Marcello Trovati et al.

The evaluation of supervised machine learning models is a critical stage in the development of reliable predictive systems. Despite the widespread availability of machine learning libraries and automated workflows, model assessment is often reduced to the reporting of a small set of aggregate metrics, which can lead to misleading conclusions about real-world performance. This paper examines the principles, challenges, and practical considerations involved in evaluating supervised learning algorithms across classification and regression tasks. In particular, it discusses how evaluation outcomes are influenced by dataset characteristics, validation design, class imbalance, asymmetric error costs, and the choice of performance metrics. Through a series of controlled experimental scenarios using diverse benchmark datasets, the study highlights common pitfalls such as the accuracy paradox, data leakage, inappropriate metric selection, and overreliance on scalar summary measures. The paper also compares alternative validation strategies and emphasizes the importance of aligning model evaluation with the intended operational objective of the task. By presenting evaluation as a decision-oriented and context-dependent process, this work provides a structured foundation for selecting metrics and validation protocols that support statistically sound, robust, and trustworthy supervised machine learning systems.

LGDec 9, 2025
Long-Sequence LSTM Modeling for NBA Game Outcome Prediction Using a Novel Multi-Season Dataset

Charles Rios, Longzhen Han, Almas Baimagambetov et al.

Predicting the outcomes of professional basketball games, particularly in the National Basketball Association (NBA), has become increasingly important for coaching strategy, fan engagement, and sports betting. However, many existing prediction models struggle with concept drift, limited temporal context, and instability across seasons. To advance forecasting in this domain, we introduce a newly constructed longitudinal NBA dataset covering the 2004-05 to 2024-25 seasons and present a deep learning framework designed to model long-term performance trends. Our primary contribution is a Long Short-Term Memory (LSTM) architecture that leverages an extended sequence length of 9,840 games equivalent to eight full NBA seasons to capture evolving team dynamics and season-over-season dependencies. We compare this model against several traditional Machine Learning (ML) and Deep Learning (DL) baselines, including Logistic Regression, Random Forest, Multi-Layer Perceptron (MLP), and Convolutional Neural Network (CNN). The LSTM achieves the best performance across all metrics, with 72.35 accuracy, 73.15 precision and 76.13 AUC-ROC. These results demonstrate the importance of long-sequence temporal modeling in basketball outcome prediction and highlight the value of our new multi-season dataset for developing robust, generalizable NBA forecasting systems.

LGOct 8, 2025
Utilizing Large Language Models for Machine Learning Explainability

Alexandros Vassiliades, Nikolaos Polatidis, Stamatios Samaras et al.

This study explores the explainability capabilities of large language models (LLMs), when employed to autonomously generate machine learning (ML) solutions. We examine two classification tasks: (i) a binary classification problem focused on predicting driver alertness states, and (ii) a multilabel classification problem based on the yeast dataset. Three state-of-the-art LLMs (i.e. OpenAI GPT, Anthropic Claude, and DeepSeek) are prompted to design training pipelines for four common classifiers: Random Forest, XGBoost, Multilayer Perceptron, and Long Short-Term Memory networks. The generated models are evaluated in terms of predictive performance (recall, precision, and F1-score) and explainability using SHAP (SHapley Additive exPlanations). Specifically, we measure Average SHAP Fidelity (Mean Squared Error between SHAP approximations and model outputs) and Average SHAP Sparsity (number of features deemed influential). The results reveal that LLMs are capable of producing effective and interpretable models, achieving high fidelity and consistent sparsity, highlighting their potential as automated tools for interpretable ML pipeline generation. The results show that LLMs can produce effective, interpretable pipelines with high fidelity and consistent sparsity, closely matching manually engineered baselines.

CRSep 30, 2025
LLM-Generated Samples for Android Malware Detection

Nik Rollinson, Nikolaos Polatidis

Android malware continues to evolve through obfuscation and polymorphism, posing challenges for both signature-based defenses and machine learning models trained on limited and imbalanced datasets. Synthetic data has been proposed as a remedy for scarcity, yet the role of large language models (LLMs) in generating effective malware data for detection tasks remains underexplored. In this study, we fine-tune GPT-4.1-mini to produce structured records for three malware families: BankBot, Locker/SLocker, and Airpush/StopSMS, using the KronoDroid dataset. After addressing generation inconsistencies with prompt engineering and post-processing, we evaluate multiple classifiers under three settings: training with real data only, real-plus-synthetic data, and synthetic data alone. Results show that real-only training achieves near perfect detection, while augmentation with synthetic data preserves high performance with only minor degradations. In contrast, synthetic-only training produces mixed outcomes, with effectiveness varying across malware families and fine-tuning strategies. These findings suggest that LLM-generated malware can enhance scarce datasets without compromising detection accuracy, but remains insufficient as a standalone training source.

MMMay 29, 2025
A Survey of Generative Categories and Techniques in Multimodal Large Language Models

Longzhen Han, Awes Mubarak, Almas Baimagambetov et al.

Multimodal Large Language Models (MLLMs) have rapidly evolved beyond text generation, now spanning diverse output modalities including images, music, video, human motion, and 3D objects, by integrating language with other sensory modalities under unified architectures. This survey categorises six primary generative modalities and examines how foundational techniques, namely Self-Supervised Learning (SSL), Mixture of Experts (MoE), Reinforcement Learning from Human Feedback (RLHF), and Chain-of-Thought (CoT) prompting, enable cross-modal capabilities. We analyze key models, architectural trends, and emergent cross-modal synergies, while highlighting transferable techniques and unresolved challenges. Architectural innovations like transformers and diffusion models underpin this convergence, enabling cross-modal transfer and modular specialization. We highlight emerging patterns of synergy, and identify open challenges in evaluation, modularity, and structured reasoning. This survey offers a unified perspective on MLLM development and identifies critical paths toward more general-purpose, adaptive, and interpretable multimodal systems.

LGMay 23, 2025
Evolving Machine Learning: A Survey

Ignacio Cabrera Martin, Subhaditya Mukherjee, Almas Baimagambetov et al.

In an era defined by rapid data evolution, traditional Machine Learning (ML) models often fall short in adapting to dynamic environments. Evolving Machine Learning (EML) has emerged as a critical paradigm, enabling continuous learning and adaptation in real-time data streams. This survey presents a comprehensive analysis of EML, focusing on five core challenges: data drift, concept drift, catastrophic forgetting, skewed learning, and network adaptation. We systematically review over 100 studies, categorizing state-of-the-art methods across supervised, unsupervised, and semi-supervised approaches. The survey explores diverse evaluation metrics, benchmark datasets, and real-world applications, offering a comparative lens on the effectiveness and limitations of current techniques. Additionally, we highlight the growing role of adaptive neural architectures, meta-learning, and ensemble strategies in addressing evolving data complexities. By synthesizing insights from recent literature, this work not only maps the current landscape of EML but also identifies critical gaps and opportunities for future research. Our findings aim to guide researchers and practitioners in developing robust, ethical, and scalable EML systems for real-world deployment.

IRMay 6, 2018
Mobile recommender systems: Identifying the major concepts

Elias Pimenidis, Nikolaos Polatidis, Haralambos Mouratidis

This paper identifies the factors that have an impact on mobile recommender systems. Recommender systems have become a technology that has been widely used by various online applications in situations where there is an information overload problem. Numerous applications such as e-Commerce, video platforms and social networks provide personalized recommendations to their users and this has improved the user experience and vendor revenues. The development of recommender systems has been focused mostly on the proposal of new algorithms that provide more accurate recommendations. However, the use of mobile devices and the rapid growth of the internet and networking infrastructure has brought the necessity of using mobile recommender systems. The links between web and mobile recommender systems are described along with how the recommendations in mobile environments can be improved. This work is focused on identifying the links between web and mobile recommender systems and to provide solid future directions that aim to lead in a more integrated mobile recommendation domain.

IRApr 26, 2018
From product recommendation to cyber-attack prediction: Generating attack graphs and predicting future attacks

Nikolaos Polatidis, Elias Pimenidis, Michalis Pavlidis et al.

Modern information society depends on reliable functionality of information systems infrastructure, while at the same time the number of cyber-attacks has been increasing over the years and damages have been caused. Furthermore, graphs can be used to show paths than can be exploited by attackers to intrude into systems and gain unauthorized access through vulnerability exploitation. This paper presents a method that builds attack graphs using data supplied from the maritime supply chain infrastructure. The method delivers all possible paths that can be exploited to gain access. Then, a recommendation system is utilized to make predictions about future attack steps within the network. We show that recommender systems can be used in cyber defense by predicting attacks. The goal of this paper is to identify attack paths and show how a recommendation method can be used to classify future cyber-attacks in terms of risk management. The proposed method has been experimentally evaluated and validated, with the results showing that it is both practical and effective.

IRApr 24, 2018
A multi-level collaborative filtering method that improves recommendations

Nikolaos Polatidis, Christos K. Georgiadis

Collaborative filtering is one of the most used approaches for providing recommendations in various online environments. Even though collaborative recommendation methods have been widely utilized due to their simplicity and ease of use, accuracy is still an issue. In this paper we propose a multi-level recommendation method with its main purpose being to assist users in decision making by providing recommendations of better quality. The proposed method can be applied in different online domains that use collaborative recommender systems, thus improving the overall user experience. The efficiency of the proposed method is shown by providing an extensive experimental evaluation using five real datasets and with comparisons to alternatives.

IRFeb 6, 2017
A dynamic multi-level collaborative filtering method for improved recommendations

Nikolaos Polatidis, Christos K. Georgiadis

One of the most used approaches for providing recommendations in various online environments such as e-commerce is collaborative filtering. Although, this is a simple method for recommending items or services, accuracy and quality problems still exist. Thus, we propose a dynamic multi-level collaborative filtering method that improves the quality of the recommendations. The proposed method is based on positive and negative adjustments and can be used in different domains that utilize collaborative filtering to increase the quality of the user experience. Furthermore, the effectiveness of the proposed method is shown by providing an extensive experimental evaluation based on three real datasets and by comparisons to alternative methods.

IRAug 29, 2014
Mobile recommender systems: An overview of technologies and challenges

Nikolaos Polatidis, Christos K. Georgiadis

The use of mobile devices in combination with the rapid growth of the internet has generated an information overload problem. Recommender systems is a necessity to decide which of the data are relevant to the user. However in mobile devices there are different factors who are crucial to information retrieval, such as the location, the screen size and the processor speed. This paper gives an overview of the technologies related to mobile recommender systems and a more detailed description of the challenged faced.

HCAug 29, 2014
Factors Influencing the Quality of the User Experience in Ubiquitous Recommender Systems

Nikolaos Polatidis, Christos K. Georgiadis

The use of mobile devices and the rapid growth of the internet and networking infrastructure has brought the necessity of using Ubiquitous recommender systems. However in mobile devices there are different factors that need to be considered in order to get more useful recommendations and increase the quality of the user experience. This paper gives an overview of the factors related to the quality and proposes a new hybrid recommendation model.

CYAug 28, 2014
Chatbot for admissions

Nikolaos Polatidis

The communication of potential students with a university department is performed manually and it is a very time consuming procedure. The opportunity to communicate with on a one-to-one basis is highly valued. However with many hundreds of applications each year, one-to-one conversations are not feasible in most cases. The communication will require a member of academic staff to expend several hours to find suitable answers and contact each student. It would be useful to reduce his costs and time. The project aims to reduce the burden on the head of admissions, and potentially other users, by developing a convincing chatbot. A suitable algorithm must be devised to search through the set of data and find a potential answer. The program then replies to the user and provides a relevant web link if the user is not satisfied by the answer. Furthermore a web interface is provided for both users and an administrator. The achievements of the project can be summarised as follows. To prepare the background of the project a literature review was undertaken, together with an investigation of existing tools, and consultation with the head of admissions. The requirements of the system were established and a range of algorithms and tools were investigated, including keyword and template matching. An algorithm that combines keyword matching with string similarity has been developed. A usable system using the proposed algorithm has been implemented. The system was evaluated by keeping logs of questions and answers and by feedback received by potential students that used it.