Ferhat Özgür Çatak

h-index: 25

32 papers

668 citations

Novelty: 33%

AI Score: 45

Ranked #40,619 of 194,257 authors (top 21%); #9,465 in LG (top 24%)

32 Papers

18.2 · CR · Apr 16, 2022

Homomorphic Encryption and Federated Learning based Privacy-Preserving CNN Training: COVID-19 Detection Use-Case

Febrianti Wibawa, Ferhat Ozgur Catak, Salih Sarp et al.

Medical data is often highly sensitive in terms of data privacy and security concerns. Federated learning, a type of machine learning technique, has begun to be used to improve the privacy and security of medical data. In federated learning, the training data is distributed across multiple machines, and the learning process is performed in a collaborative manner. Attackers can mount several privacy attacks on deep learning (DL) models to obtain sensitive information. Therefore, the DL model itself should be protected from adversarial attacks, especially for applications using medical data. One solution to this problem is homomorphic encryption-based protection of the model from an adversarial collaborator. This paper proposes a privacy-preserving federated learning algorithm for medical data using homomorphic encryption. The proposed algorithm uses a secure multi-party computation protocol to protect the deep learning model from adversaries. In this study, the proposed algorithm is evaluated on a real-world medical dataset in terms of model performance.

7.0 · CR · Aug 12, 2022 · Code

Defensive Distillation based Adversarial Attacks Mitigation Method for Channel Estimation using Deep Learning Models in Next-Generation Wireless Networks

Ferhat Ozgur Catak, Murat Kuzlu, Evren Catak et al.

Future wireless networks (5G and beyond) are the vision of forthcoming cellular systems, connecting billions of devices and people together. In the last decades, cellular networks have grown dramatically with advanced telecommunication technologies for high-speed data transmission, high cell capacity, and low latency. The main goal of those technologies is to support a wide range of new applications, such as virtual reality, metaverse, telehealth, online education, autonomous and flying vehicles, smart cities, smart grids, advanced manufacturing, and many more. The key motivation of NextG networks is to meet the high demand for those applications by improving and optimizing network functions. Artificial Intelligence (AI) has a high potential to achieve these requirements by being integrated in applications throughout all layers of the network. However, the security concerns on network functions of NextG using AI-based models, e.g., model poisoning, have not been investigated deeply. Therefore, efficient mitigation techniques and secure solutions need to be designed for NextG networks using AI-based methods. This paper presents a comprehensive vulnerability analysis of deep learning (DL)-based channel estimation models, trained with the dataset obtained from MATLAB's 5G toolbox, under adversarial attacks and defensive distillation-based mitigation methods. The adversarial attacks produce faulty results by manipulating trained DL-based models for channel estimation in NextG networks, while the mitigation methods make the models more robust against such attacks. This paper also presents the performance of the proposed defensive distillation mitigation method for each adversarial attack against the channel estimation model. The results indicated that the proposed mitigation method can defend the DL-based channel estimation models against adversarial attacks in NextG networks.

2.7 · LG · Feb 2 · Code

EvalQReason: A Framework for Step-Level Reasoning Evaluation in Large Language Models

Shaima Ahmad Freja, Ferhat Ozgur Catak, Betul Yurdem et al.

Large Language Models (LLMs) are increasingly deployed in critical applications requiring reliable reasoning, yet their internal reasoning processes remain difficult to evaluate systematically. Existing methods focus on final-answer correctness, providing limited insight into how reasoning unfolds across intermediate steps. We present EvalQReason, a framework that quantifies LLM reasoning quality through step-level probability distribution analysis without requiring human annotation. The framework introduces two complementary algorithms: Consecutive Step Divergence (CSD), which measures local coherence between adjacent reasoning steps, and Step-to-Final Convergence (SFC), which assesses global alignment with final answers. Each algorithm employs five statistical metrics to capture reasoning dynamics. Experiments across mathematical and medical datasets with open-source 7B-parameter models demonstrate that CSD-based features achieve strong predictive performance for correctness classification, with classical machine learning models reaching F1=0.78 and ROC-AUC=0.82, and sequential neural models substantially improving performance (F1=0.88, ROC-AUC=0.97). CSD consistently outperforms SFC, and sequential architectures outperform classical machine learning approaches. Critically, reasoning dynamics prove domain-specific: mathematical reasoning exhibits clear divergence-based discrimination patterns between correct and incorrect solutions, while medical reasoning shows minimal discriminative signals, revealing fundamental differences in how LLMs process different reasoning types. EvalQReason enables scalable, process-aware evaluation of reasoning reliability, establishing probability-based divergence analysis as a principled approach for trustworthy AI deployment.

1.8 · LG · Dec 5, 2022

Anomaly Detection in Power Markets and Systems

Ugur Halden, Umit Cali, Ferhat Ozgur Catak et al.

The widespread use of information and communication technology (ICT) over the course of the last decades has been a primary catalyst behind the digitalization of power systems. Meanwhile, as the utilization rate of the Internet of Things (IoT) continues to rise along with recent advancements in ICT, the need for secure and computationally efficient monitoring of critical infrastructures like the electrical grid and the agents that participate in it is growing. A cyber-physical system, such as the electrical grid, may experience anomalies for a number of different reasons. These may include physical defects, mistakes in measurement and communication, cyberattacks, and other similar occurrences. The goal of this study is to highlight the most common incidents in power systems and to give an overview and classification of the most common ways to find problems, starting with the consumer/prosumer end and working up to the primary power producers. In addition, this article discusses the methods and techniques, such as artificial intelligence (AI), that are used to identify anomalies in power systems and markets.

3.3 · LG · Sep 21, 2022

Hybrid AI-based Anomaly Detection Model using Phasor Measurement Unit Data

Yuval Abraham Regev, Henrik Vassdal, Ugur Halden et al.

Over the last few decades, extensive use of information and communication technologies has been the main driver of the digitalization of power systems. Proper and secure monitoring of the critical grid infrastructure has become an integral part of the modern power system. Using phasor measurement units (PMUs) to surveil the power system is one of the technologies that have a promising future. Increased frequency of measurements and smarter methods for data handling can improve the ability to reliably operate power grids. The increased cyber-physical interaction offers both benefits and drawbacks, where one of the drawbacks comes in the form of anomalies in the measurement data. The anomalies can be caused by physical faults on the power grid as well as by disturbances, errors, and cyber attacks in the cyber layer. This paper aims to develop a hybrid AI-based model that is based on various methods such as Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN) and other relevant hybrid algorithms for anomaly detection in phasor measurement unit data. The dataset used within this research was acquired by the University of Texas and consists of real data from grid measurements. In addition to the real data, false data that has been injected to produce anomalies has been analyzed. The impacts of such anomalies and methods to mitigate them are discussed.

1.2 · NI · Sep 27, 2022

Mitigating Attacks on Artificial Intelligence-based Spectrum Sensing for Cellular Network Signals

Ferhat Ozgur Catak, Murat Kuzlu, Salih Sarp et al.

Cellular networks (LTE, 5G, and beyond) are growing dramatically with high consumer demand and, with advanced telecommunication technologies, are more promising than other wireless networks. The main goal of these networks is to connect billions of devices, systems, and users with high-speed data transmission, high cell capacity, and low latency, as well as to support a wide range of new applications, such as virtual reality, metaverse, telehealth, online education, autonomous and flying vehicles, advanced manufacturing, and many more. To achieve these goals, spectrum sensing has received more attention, along with new approaches using artificial intelligence (AI) methods for spectrum management in cellular networks. This paper provides a vulnerability analysis of spectrum sensing approaches using AI-based semantic segmentation models for identifying cellular network signals under adversarial attacks with and without defensive distillation methods. The results showed that mitigation methods can significantly reduce the vulnerabilities of AI-based spectrum sensing models against adversarial attacks.

2.3 · CR · Jul 11, 2024

Neural Networks Meet Elliptic Curve Cryptography: A Novel Approach to Secure Communication

Mina Cecilie Wøien, Ferhat Ozgur Catak, Murat Kuzlu et al.

In recent years, neural networks have been used to implement symmetric cryptographic functions for secure communications. Extending this domain, the proposed approach explores the application of asymmetric cryptography within a neural network framework to safeguard the exchange between two communicating entities, i.e., Alice and Bob, from an adversarial eavesdropper, i.e., Eve. It employs a set of five distinct cryptographic keys to examine the efficacy and robustness of communication security against eavesdropping attempts using the principles of elliptic curve cryptography. The experimental setup reveals that Alice and Bob achieve secure communication with negligible variation in security effectiveness across different curves. It is also designed to evaluate cryptographic resilience. Specifically, the loss metrics for Bob oscillate between 0 and 1 during encryption-decryption processes, indicating successful message comprehension post-encryption by Alice. A potential vulnerability appears, with decryption accuracy exceeding 60%, when Eve receives enhanced adversarial training with twice the training iterations per batch compared to Alice and Bob.

3.0 · CR · Mar 5

Quantum Key Distribution Secured Federated Learning for Channel Estimation and Radar Spectrum Sensing in 6G Networks

Ferhat Ozgur Catak, Murat Kuzlu, Jungwon Seo et al.

This paper presents a federated learning framework secured by quantum key distribution (QKD) for wireless channel estimation and radar spectrum sensing in the next generation networks (NextG or Beyond 6G). A BB84-style protocol abstraction and pairwise additive masking are utilized to train clients' local models (CNN for channel estimation, U-Net for radar segmentation) and upload only masked model updates. The server aggregates without observing plain parameters; an eavesdropper without QKD keys cannot recover individual updates. Experiments show that secure FL achieves NMSE of 0.216 for channel estimation and 92.1% accuracy with 0.72 mIoU for radar sensing. When an eavesdropper is present, QBER rises to ~25% and all rounds abort as intended; reconstruction error remains below $10^{-5}$, confirming correct aggregation.
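
The pairwise additive masking idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the modulus, the integer-quantized updates, the helper names (`pairwise_mask`, `mask_update`, `aggregate`), and the use of Python's non-cryptographic `random.Random` as a stand-in for key material agreed through QKD are all assumptions made for illustration.

```python
import random

MODULUS = 2 ** 32  # assumed modulus for the masked arithmetic


def pairwise_mask(shared_key, length):
    """Expand a shared key into a mask vector (stand-in for a QKD-derived key stream)."""
    rng = random.Random(shared_key)  # NOT cryptographically secure; illustration only
    return [rng.randrange(MODULUS) for _ in range(length)]


def mask_update(client_id, update, shared_keys):
    """Add the masks shared with higher-id peers and subtract those shared with lower-id peers."""
    masked = list(update)
    for peer_id, key in shared_keys[client_id].items():
        mask = pairwise_mask(key, len(update))
        sign = 1 if client_id < peer_id else -1
        masked = [(m + sign * r) % MODULUS for m, r in zip(masked, mask)]
    return masked


def aggregate(masked_updates):
    """Sum masked updates element-wise; the pairwise masks cancel in the sum."""
    return [sum(column) % MODULUS for column in zip(*masked_updates)]


# Hypothetical integer-quantized model updates from three clients.
updates = {0: [5, 10, 15], 1: [1, 2, 3], 2: [7, 8, 9]}

# Hypothetical symmetric keys, one per client pair.
pair_keys = {(0, 1): 1234, (0, 2): 5678, (1, 2): 9012}
shared_keys = {cid: {} for cid in updates}
for (a, b), key in pair_keys.items():
    shared_keys[a][b] = key
    shared_keys[b][a] = key

masked = [mask_update(cid, upd, shared_keys) for cid, upd in updates.items()]
total = aggregate(masked)
print(total)
```

Each mask is added by one client of a pair and subtracted by the other, so the server recovers only the sum of the updates while each individual upload looks random to anyone without the pair keys.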

1.4 · LG · Mar 3

Logit-Level Uncertainty Quantification in Vision-Language Models for Histopathology Image Analysis

Betul Yurdem, Ferhat Ozgur Catak, Murat Kuzlu et al.

Vision-Language Models (VLMs) with their multimodal capabilities have demonstrated remarkable success in almost all domains, including education, transportation, healthcare, energy, finance, law, and retail. Nevertheless, the utilization of VLMs in healthcare applications raises crucial concerns due to the sensitivity of large-scale medical data and the trustworthiness of these models (reliability, transparency, and security). This study proposes a logit-level uncertainty quantification (UQ) framework for histopathology image analysis using VLMs to deal with these concerns. UQ is evaluated for three VLMs using metrics derived from temperature-controlled output logits. The proposed framework demonstrates a critical separation in uncertainty behavior. Two of the VLMs, VILA-M3-8B and LLaVA-Med v1.5, show high stochastic sensitivity (mean cosine similarity (CS) $<0.71$ and $<0.84$, Jensen-Shannon divergence (JS) $<0.57$ and $<0.38$, and Kullback-Leibler divergence (KL) $<0.55$ and $<0.35$, respectively), near-maximal temperature impacts ($\Delta_T \approx 1.00$), and abrupt uncertainty transitions, particularly for complex diagnostic prompts. In contrast, the pathology-specific PRISM model maintains near-deterministic behavior (mean CS $>0.90$, JS $<0.10$, KL $<0.09$) and minimal temperature effects across all prompt complexities. These findings emphasize the importance of logit-level uncertainty quantification to evaluate trustworthiness in histopathology applications utilizing VLMs.
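
As a rough illustration of the three metrics named in this abstract, the sketch below applies temperature scaling to a vector of logits and compares the resulting distributions with cosine similarity, Jensen-Shannon divergence, and Kullback-Leibler divergence. The logits, the two temperatures, and the function names are hypothetical; how the paper aggregates these quantities over prompts and models is not stated in the abstract.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax with max-subtraction for numerical stability."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q is strictly positive wherever p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln(2)."""
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)


def cosine_similarity(p, q):
    """Cosine of the angle between two vectors."""
    dot = sum(pi * qi for pi, qi in zip(p, q))
    norm_p = math.sqrt(sum(pi * pi for pi in p))
    norm_q = math.sqrt(sum(qi * qi for qi in q))
    return dot / (norm_p * norm_q)


# Hypothetical output logits for one prompt, compared at two temperatures.
logits = [2.0, 1.0, 0.1]
p_low = softmax(logits, temperature=0.5)   # sharper distribution
p_high = softmax(logits, temperature=2.0)  # flatter distribution

print("CS:", cosine_similarity(p_low, p_high))
print("JS:", js_divergence(p_low, p_high))
print("KL:", kl_divergence(p_low, p_high))
```

A model whose outputs barely change with temperature would give CS close to 1 and JS and KL close to 0 for such a pair, which matches the near-deterministic behavior the abstract reports for PRISM.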

8.0 · IT · Jan 3, 2025 · Code

BERT4MIMO: A Foundation Model using BERT Architecture for Massive MIMO Channel State Information Prediction

Ferhat Ozgur Catak, Murat Kuzlu, Umit Cali

Massive MIMO (Multiple-Input Multiple-Output) is an advanced wireless communication technology, using a large number of antennas to improve the overall performance of the communication system in terms of capacity, spectral, and energy efficiency. The performance of MIMO systems is highly dependent on the quality of channel state information (CSI). Predicting CSI is, therefore, essential for improving communication system performance, particularly in MIMO systems, since it represents key characteristics of a wireless channel, including propagation, fading, scattering, and path loss. This study proposes a foundation model inspired by BERT, called BERT4MIMO, which is specifically designed to process high-dimensional CSI data from massive MIMO systems. BERT4MIMO offers superior performance in reconstructing CSI under varying mobility scenarios and channel conditions through deep learning and attention mechanisms. The experimental results demonstrate the effectiveness of BERT4MIMO in a variety of wireless environments.

14.4 · LG · Jan 31, 2025

Understanding Federated Learning from IID to Non-IID dataset: An Experimental Study

Jungwon Seo, Ferhat Ozgur Catak, Chunming Rong

As privacy concerns and data regulations grow, federated learning (FL) has emerged as a promising approach for training machine learning models across decentralized data sources without sharing raw data. However, a significant challenge in FL is that client data are often non-IID (non-independent and identically distributed), leading to reduced performance compared to centralized learning. While many methods have been proposed to address this issue, their underlying mechanisms are often viewed from different perspectives. Through a comprehensive investigation from gradient descent to FL, and from IID to non-IID data settings, we find that inconsistencies in client loss landscapes primarily cause performance degradation in non-IID scenarios. From this understanding, we observe that existing methods can be grouped into two main strategies: (i) adjusting parameter update paths and (ii) modifying client loss landscapes. These findings offer a clear perspective on addressing non-IID challenges in FL and help guide future research in the field.
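
For readers unfamiliar with the baseline that non-IID methods modify, here is a minimal sketch of sample-weighted parameter averaging in the style of FedAvg. The abstract does not name a specific aggregation rule, so the rule, the function name `fed_avg`, and the toy numbers are assumptions for illustration only.

```python
def fed_avg(client_params, client_sizes):
    """Average client parameter vectors, weighting each client by its number of samples."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients with different amounts of local data.
client_params = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [1, 3]

global_params = fed_avg(client_params, client_sizes)
print(global_params)
```

When client data are non-IID, the local parameter vectors being averaged come from different loss landscapes, which is the inconsistency the abstract identifies as the main cause of degradation.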

6.5 · CV · Nov 24, 2024

Improving Medical Diagnostics with Vision-Language Models: Convex Hull-Based Uncertainty Analysis

Ferhat Ozgur Catak, Murat Kuzlu, Taylor Patrick

In recent years, vision-language models (VLMs) have been applied to various fields, including healthcare, education, finance, and manufacturing, with remarkable performance. However, concerns remain regarding VLMs' consistency and uncertainty, particularly in critical applications such as healthcare, which demand a high level of trust and reliability. This paper proposes a novel approach to evaluate uncertainty in VLMs' responses using a convex hull approach on a healthcare application for Visual Question Answering (VQA). The LLM-CXR model is selected as the medical VLM and is used to generate responses for a given prompt at different temperature settings, i.e., 0.001, 0.25, 0.50, 0.75, and 1.00. According to the results, the LLM-CXR VLM shows high uncertainty at higher temperature settings. Experimental outcomes emphasize the importance of uncertainty in VLMs' responses, especially in healthcare applications.

7.1 · LG · Mar 17, 2025 · Code

GC-Fed: Gradient Centralized Federated Learning with Partial Client Participation

Jungwon Seo, Ferhat Ozgur Catak, Chunming Rong et al.

Federated Learning (FL) enables privacy-preserving multi-source information fusion (MSIF) but is challenged by client drift in highly heterogeneous data settings. Many existing drift-mitigation strategies rely on reference-based techniques--such as gradient adjustments or proximal loss--that use historical snapshots (e.g., past gradients or previous global models) as reference points. When only a subset of clients participates in each training round, these historical references may not accurately capture the overall data distribution, leading to unstable training. In contrast, our proposed Gradient Centralized Federated Learning (GC-Fed) employs a hyperplane as a historically independent reference point to guide local training and enhance inter-client alignment. GC-Fed comprises two complementary components: Local GC, which centralizes gradients during local training, and Global GC, which centralizes updates during server aggregation. In our hybrid design, Local GC is applied to feature-extraction layers to harmonize client contributions, while Global GC refines classifier layers to stabilize round-wise performance. Theoretical analysis and extensive experiments on benchmark FL tasks demonstrate that GC-Fed effectively mitigates client drift and achieves up to a 20% improvement in accuracy under heterogeneous and partial participation conditions.
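
The generic gradient centralization operation that the method's name refers to can be sketched as follows: each row of a gradient matrix has its mean removed, which projects it onto the hyperplane of zero-sum vectors. This is only the basic operation under the assumption that rows correspond to output units; how GC-Fed applies it in Local GC and Global GC is not detailed in the abstract.

```python
def centralize(gradient_rows):
    """Subtract each row's mean so that every row sums to zero."""
    centered = []
    for row in gradient_rows:
        mean = sum(row) / len(row)
        centered.append([g - mean for g in row])
    return centered


# Hypothetical gradient matrix: one row per output unit (an assumed layout).
grad = [[0.5, -1.0, 2.0], [3.0, 3.0, 3.0]]
centered = centralize(grad)
print(centered)
```

Because the zero-sum hyperplane is fixed in advance, this reference does not depend on past gradients or earlier global models, in line with the "historically independent" description above.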

13.3 · AI · Jun 28, 2024

Uncertainty Quantification in Large Language Models Through Convex Hull Analysis

Ferhat Ozgur Catak, Murat Kuzlu

Uncertainty quantification approaches have become more critical in large language models (LLMs), particularly in high-risk applications requiring reliable outputs. However, traditional methods for uncertainty quantification, such as probabilistic models and ensemble techniques, face challenges when applied to the complex and high-dimensional nature of LLM-generated outputs. This study proposes a novel geometric approach to uncertainty quantification using convex hull analysis. The proposed method leverages the spatial properties of response embeddings to measure the dispersion and variability of model outputs. The prompts are categorized into three types, i.e., 'easy', 'moderate', and 'confusing', to generate multiple responses using different LLMs at varying temperature settings. The responses are transformed into high-dimensional embeddings via a BERT model and subsequently projected into a two-dimensional space using Principal Component Analysis (PCA). The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm is utilized to cluster the embeddings and compute the convex hull for each selected cluster. The experimental results indicate that the uncertainty of LLMs depends on the prompt complexity, the model, and the temperature setting.
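
The final geometric step of this pipeline can be sketched in a few lines. The example below computes the convex hull of a set of 2D points with Andrew's monotone chain algorithm and its area with the shoelace formula. The points are hypothetical stand-ins for PCA-projected response embeddings belonging to one DBSCAN cluster, and treating hull area as the dispersion score is an assumption, since the abstract does not state which hull statistic is used.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical 2D projections of responses to the same prompt at two temperatures.
tight_cluster = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
spread_cluster = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 1.0)]

tight_area = polygon_area(convex_hull(tight_cluster))
spread_area = polygon_area(convex_hull(spread_cluster))
print(tight_area, spread_area)
```

Under this assumed scoring, a larger hull area means the responses are more dispersed in embedding space, which would be read as higher uncertainty.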

5.2 · CR · Feb 16, 2022

The Adversarial Security Mitigations of mmWave Beamforming Prediction Models using Defensive Distillation and Adversarial Retraining

Murat Kuzlu, Ferhat Ozgur Catak, Umit Cali et al.

The design of a security scheme for beamforming prediction is critical for next-generation wireless networks (5G, 6G, and beyond). However, there is no consensus about protecting the beamforming prediction using deep learning algorithms in these networks. This paper presents the security vulnerabilities in deep learning for beamforming prediction using deep neural networks (DNNs) in 6G wireless networks, which treats the beamforming prediction as a multi-output regression problem. It is indicated that the initial DNN model is vulnerable to adversarial attacks, such as Fast Gradient Sign Method (FGSM), Basic Iterative Method (BIM), Projected Gradient Descent (PGD), and Momentum Iterative Method (MIM), because the initial DNN model is sensitive to the perturbations of the adversarial samples of the training data. This study also offers two mitigation methods, adversarial training and defensive distillation, for adversarial attacks against artificial intelligence (AI)-based models used in millimeter-wave (mmWave) beamforming prediction. Furthermore, the proposed scheme can be used in situations where the data are corrupted due to the adversarial examples in the training data. Experimental results show that the proposed methods effectively defend the DNN models against adversarial attacks in next-generation wireless networks.
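
To make the first of the listed attacks concrete, the sketch below applies the Fast Gradient Sign Method to a toy linear regressor with a squared-error loss. The model, its weights, the input, and the perturbation budget `epsilon` are all hypothetical, and a single-output linear model is used in place of the multi-output DNN described in the abstract so that the input gradient can be written in closed form.

```python
def predict(weights, bias, x):
    """Linear regression prediction w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, y_true, epsilon):
    """One FGSM step: move each input feature by epsilon in the direction that raises the loss."""
    error = predict(weights, bias, x) - y_true
    # Gradient of the squared error (prediction - y_true)^2 with respect to x.
    grad_x = [2.0 * error * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad_x)]


# Hypothetical trained model and clean input.
weights, bias = [0.5, -1.0, 2.0], 0.1
x, y_true = [1.0, 2.0, 3.0], 4.0
epsilon = 0.1

x_adv = fgsm(weights, bias, x, y_true, epsilon)
clean_loss = (predict(weights, bias, x) - y_true) ** 2
adv_loss = (predict(weights, bias, x_adv) - y_true) ** 2
print(clean_loss, adv_loss)
```

BIM, PGD, and MIM repeat this kind of step iteratively, with clipping, projection, or momentum respectively; adversarial training then feeds such perturbed inputs back into the training set.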

1.8 · LG · Feb 15, 2022

Unreasonable Effectiveness of Last Hidden Layer Activations for Adversarial Robustness

Omer Faruk Tuna, Ferhat Ozgur Catak, M. Taner Eskil

In standard Deep Neural Network (DNN) based classifiers, the general convention is to omit the activation function in the last (output) layer and directly apply the softmax function on the logits to get the probability scores of each class. In this type of architecture, the loss value of the classifier against any output class is directly proportional to the difference between the final probability score and the label value of the associated class. Standard white-box adversarial evasion attacks, whether targeted or untargeted, mainly try to exploit the gradient of the model loss function to craft adversarial samples and fool the model. In this study, we show both mathematically and experimentally that using some widely known activation functions in the output layer of the model with high temperature values has the effect of zeroing out the gradients for both targeted and untargeted attack cases, preventing attackers from exploiting the model's loss function to craft adversarial samples. We've experimentally verified the efficacy of our approach on the MNIST (Digit) and CIFAR10 datasets. Detailed experiments confirmed that our approach substantially improves robustness against gradient-based targeted and untargeted attack threats. We also showed that the increased non-linearity at the output layer has some additional benefits against some other attack methods like the Deepfool attack.

10.7 · CR · Sep 29, 2021

Secure Multi-Party Computation based Privacy Preserving Data Analysis in Healthcare IoT Systems

Kevser Şahinbaş, Ferhat Ozgur Catak

Recently, healthcare has seen many innovations driven by rapidly growing Internet-of-Things (IoT) technology, which provides significant developments and facilities in the health sector and improves daily human life. The IoT bridges people and information technology and speeds up shopping. For these reasons, IoT technology has started to be used on a large scale. The use of IoT technology in health services, for chronic disease monitoring, health monitoring, rapid intervention, early diagnosis and treatment, etc., facilitates the delivery of health services. However, the data transferred to the digital environment pose a threat of privacy leakage. Unauthorized persons have used them, and there have been malicious attacks on the health and privacy of individuals. This study proposes a model based on federated learning to handle these privacy problems. In addition, we apply secure multi-party computation. Our proposed model presents an extensive privacy and data analysis and achieves high performance.

8.0 · CV · Jul 11, 2021 · Code

Prediction Surface Uncertainty Quantification in Object Detection Models for Autonomous Driving

Ferhat Ozgur Catak, Tao Yue, Shaukat Ali

Object detection in autonomous cars is commonly based on camera images and Lidar inputs, which are often used to train prediction models such as deep artificial neural networks for decision making for object recognition, adjusting speed, etc. A mistake in such decision making can be damaging; thus, it is vital to measure the reliability of decisions made by such prediction models via uncertainty measurement. Uncertainty, in deep learning models, is often measured for classification problems. However, deep learning models in autonomous driving are often multi-output regression models. Hence, we propose a novel method called PURE (Prediction sURface uncErtainty) for measuring prediction uncertainty of such regression models. We formulate the object recognition problem as a regression model with more than one output for finding object locations in a 2-dimensional camera view. For evaluation, we modified three widely-applied object recognition models (i.e., YoLo, SSD300 and SSD512) and used the KITTI, Stanford Cars, Berkeley DeepDrive, and NEXET datasets. Results showed a statistically significant negative correlation between prediction surface uncertainty and prediction accuracy, suggesting that uncertainty significantly impacts the decisions made by autonomous driving.

4.3 · SP · May 9, 2021

Security Concerns on Machine Learning Solutions for 6G Networks in mmWave Beam Prediction

Ferhat Ozgur Catak, Evren Catak, Murat Kuzlu et al.

6G -- sixth generation -- is the latest cellular technology currently under development for wireless communication systems. In recent years, machine learning algorithms have been applied widely in various fields, such as healthcare, transportation, energy, autonomous cars, and many more. Those algorithms have also been used in communication technologies to improve the system performance in terms of frequency spectrum usage, latency, and security. With the rapid developments of machine learning techniques, especially deep learning, it is critical to take the security concern into account when applying the algorithms. While machine learning algorithms offer significant advantages for 6G networks, security concerns on Artificial Intelligence (AI) models have typically been ignored by the scientific community so far. However, security is also a vital part of the AI algorithms because the AI model itself can be poisoned by attackers. This paper proposes a mitigation method for adversarial attacks against proposed 6G machine learning models for the millimeter-wave (mmWave) beam prediction using adversarial learning. The main idea behind adversarial attacks against machine learning models is to produce faulty results by manipulating trained deep learning models for 6G applications for mmWave beam prediction. We also present the adversarial learning mitigation method's performance for 6G security in mmWave beam prediction application with fast gradient sign method attack. The mean square errors (MSE) of the defended model under attack are very close to the undefended model without attack.

5.5 · LG · Mar 12, 2021

Adversarial Machine Learning Security Problems for 6G: mmWave Beam Prediction Use-Case

Evren Catak, Ferhat Ozgur Catak, Arild Moldsvor

6G is the next generation of communication systems. In recent years, machine learning algorithms have been applied widely in various fields such as health, transportation, and autonomous cars. Predictive algorithms will also be used in 6G problems. With the rapid developments of deep learning techniques, it is critical to take the security concern into account when applying the algorithms. While machine learning offers significant advantages for 6G, AI models' security is ignored. Since it has many applications in the real world, security is a vital part of the algorithms. This paper has proposed a mitigation method for adversarial attacks against proposed 6G machine learning models for the millimeter-wave (mmWave) beam prediction with adversarial learning. The main idea behind adversarial attacks against machine learning models is to produce faulty results by manipulating trained deep learning models for 6G applications for mmWave beam prediction use case. We have also presented the adversarial learning mitigation method's performance for 6G security in millimeter-wave beam prediction application with fast gradient sign method attack. The mean square errors of the defended model and undefended model are very close.

12.5 · LG · Feb 8, 2021

Exploiting epistemic uncertainty of the deep learning models to generate adversarial samples

Omer Faruk Tuna, Ferhat Ozgur Catak, M. Taner Eskil

Deep neural network architectures are considered to be robust to random perturbations. Nevertheless, it was shown that they could be severely vulnerable to slight but carefully crafted perturbations of the input, termed adversarial samples. In recent years, numerous studies have been conducted in this new area called "Adversarial Machine Learning" to devise new adversarial attacks and to defend against these attacks with more robust DNN architectures. However, almost all the research work so far has concentrated on utilising the model loss function to craft adversarial examples or create robust models. This study explores the usage of quantified epistemic uncertainty obtained from Monte-Carlo Dropout Sampling for adversarial attack purposes, by which we perturb the input toward areas that the model has not seen before. We proposed new attack ideas based on the epistemic uncertainty of the model. Our results show that our proposed hybrid attack approach increases the attack success rates from 82.59% to 85.40%, 82.86% to 89.92% and 88.06% to 90.03% on MNIST Digit, MNIST Fashion and CIFAR-10 datasets, respectively.

1.2 · SY · Jan 19, 2021

Internet of Predictable Things (IoPT) Framework to Increase Cyber-Physical System Resiliency

Umit Cali, Murat Kuzlu, Vinayak Sharma et al.

During the last two decades, distributed energy systems, especially renewable energy sources (RES), have become more economically viable with increasing market share and penetration levels on power systems. In addition to decarbonization and decentralization of energy systems, digitalization has also become very important. The use of artificial intelligence (AI), advanced optimization algorithms, Industrial Internet of Things (IIoT), and other digitalization frameworks makes modern power system assets more intelligent, while vulnerable to cybersecurity risks. This paper proposes the concept of the Internet of Predictable Things (IoPT) that incorporates advanced data analytics and machine learning methods to increase the resiliency of cyber-physical systems against cybersecurity risks. The proposed concept is demonstrated using a cyber-physical system testbed under a variety of cyber attack scenarios as a proof of concept (PoC).

5.0 · LG · Dec 11, 2020 · Code

Closeness and Uncertainty Aware Adversarial Examples Detection in Adversarial Machine Learning

Omer Faruk Tuna, Ferhat Ozgur Catak, M. Taner Eskil

While state-of-the-art Deep Neural Network (DNN) models are considered to be robust to random perturbations, it was shown that these architectures are highly vulnerable to deliberately crafted perturbations, albeit being quasi-imperceptible. These vulnerabilities make it challenging to deploy DNN models in security-critical areas. In recent years, many research studies have been conducted to develop new attack methods and come up with new defense techniques that enable more robust and reliable models. In this work, we explore and assess the usage of different types of metrics for detecting adversarial samples. We first leverage moment-based predictive uncertainty estimates of a DNN classifier obtained using Monte-Carlo Dropout Sampling. We also introduce a new method that operates in the subspace of deep features extracted by the model. We verified the effectiveness of our approach on a range of standard datasets like MNIST (Digit), MNIST (Fashion) and CIFAR-10. Our experiments show that these two different approaches complement each other, and the combined usage of all the proposed metrics yields up to 99% ROC-AUC scores regardless of the attack algorithm.

16.2CROct 5, 2020Code

Data Augmentation Based Malware Detection using Convolutional Neural Networks

Ferhat Ozgur Catak, Javed Ahmed, Kevser Sahinbas et al.

Recently, cyber-attacks have become widespread due to the continual increase of malware in the cyber world. These attacks cause irreversible damage not only to end-users but also to corporate computer systems. Ransomware attacks such as WannaCry and Petya specifically target critical infrastructures such as airports and render operational processes inoperable. Hence, such malware has attracted increasing attention in terms of volume, versatility, and intricacy. The most important feature of this type of malware is that it changes shape as it propagates from one computer to another. Standard signature-based detection software fails to identify this type of malware because it has different characteristics on each infected computer. This paper aims to provide image-augmentation-enhanced deep convolutional neural network (CNN) models for the detection of malware families in a metamorphic malware environment. The proposed model structure consists of three components: image generation from malware samples, image augmentation, and classification of the malware families using a convolutional neural network model. In the first component, the collected malware samples are converted from their binary representation to 3-channel images using a windowing technique. The second component creates augmented versions of the images, and the last component builds a classification model. In this study, five different deep convolutional neural network models are used for malware family detection.

14.8CRMay 6, 2019Code

A Benchmark API Call Dataset for Windows PE Malware Classification

Ferhat Ozgur Catak, Ahmet Faruk Yazı

The use of operating system API calls is a promising approach to the detection of PE-type malware in the Windows operating system. This task is formally defined as running malware in an isolated sandbox environment, recording the API calls made to the Windows operating system, and sequentially analyzing these calls. Here, we have analyzed 7107 different malware samples belonging to various families such as virus, backdoor, and trojan in an isolated sandbox environment and transformed the analysis results into a format where different classification algorithms and methods can be used. First, we explain how we obtained the malware, and then how we grouped the samples into families. Finally, we describe how to perform malware classification tasks using different computational methods, for researchers who will use the data set we have created.

3.1CRNov 7, 2016

Privacy Preserving PageRank Algorithm By Using Secure Multi-Party Computation

Ferhat Ozgur Catak

In this work, we study the problem of privacy-preserving computation of the PageRank algorithm. The idea is to enforce secure multi-party computation of the algorithm iteratively using homomorphic encryption based on the Paillier scheme. In the proposed PageRank computation, a user encrypts its own graph data using an asymmetric encryption method and sends the data set to different parties in a privacy-preserving manner. Each party computes its own encrypted entity, but learns nothing about the data at other parties.
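The property that makes the Paillier scheme usable here is that it is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so a party can accumulate encrypted rank contributions without decrypting them. The following toy sketch shows only that property. It is not the paper's protocol; the tiny hard-coded primes are insecure, and the function names and the fixed-point integer encoding of rank contributions are assumptions made for the example.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier key generation with tiny primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    """Encrypt an integer 0 <= m < n under public key n with generator g = n + 1."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """Recover the plaintext using L(u) = (u - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n

def add_encrypted(n, c1, c2):
    """Ciphertext product corresponds to plaintext sum (mod n)."""
    return (c1 * c2) % (n * n)

def scale_encrypted(n, c, k):
    """Ciphertext power corresponds to plaintext multiplied by the integer k (mod n)."""
    return pow(c, k, n * n)

n, priv = keygen()
# Hypothetical rank contributions, encoded as integers (e.g. scaled by 1000).
contributions = [125, 300, 75]
encrypted_sum = encrypt(n, 0)
for value in contributions:
    encrypted_sum = add_encrypted(n, encrypted_sum, encrypt(n, value))
total = decrypt(n, priv, encrypted_sum)
print("decrypted sum of contributions:", total)
```

Only the key holder can read `total`; the party that multiplies the ciphertexts sees random-looking integers. A real deployment would use a vetted library and key sizes of at least 2048 bits.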

1.0LGFeb 9, 2016

Robust Ensemble Classifier Combination Based on Noise Removal with One-Class SVM

Ferhat Özgür Çatak

In machine learning, as the number of labeled input samples becomes very large, it is difficult to build a classification model because the input data set does not fit in memory during the training phase of the algorithm; therefore, it is necessary to use data partitioning to handle the overall data set. Bagging- and boosting-based data partitioning methods have been broadly used in data mining and pattern recognition. Both of these methods have shown great potential for improving classification model performance. This study is concerned with the analysis of data set partitioning with noise removal and its impact on the performance of multiple classifier models. In this study, we propose noise-filtering preprocessing at each data set partition to improve classifier model performance. We applied the Gini impurity approach to find the best split percentage for the noise filter ratio. The filtered sub data set is then used to train individual ensemble models.
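For reference, the Gini impurity of a set of labels is one minus the sum of squared class proportions, and a candidate split is usually scored by the size-weighted impurity of its two sides. The sketch below computes both. How the paper maps this score onto the noise filter ratio is not described in the abstract, so the helper names and the example split are assumptions.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity: 1 - sum over classes of (class proportion squared)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_split_impurity(left, right):
    """Size-weighted Gini impurity of a two-way split; lower is better."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n

# Hypothetical labels on the two sides of a candidate split.
kept = ["pos", "pos", "pos", "neg"]
removed = ["neg", "neg"]
print("impurity before split:", gini_impurity(kept + removed))
print("weighted impurity after split:", weighted_split_impurity(kept, removed))
```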

1.9LGFeb 9, 2016

Classification with Boosting of Extreme Learning Machine Over Arbitrarily Partitioned Data

Ferhat Özgür Çatak

Machine learning based computational intelligence methods are widely used to analyze large scale data sets in this age of big data. Extracting useful predictive models from these types of data sets is a challenging problem due to their high complexity. Analyzing large amounts of streaming data that can be leveraged to derive business value is another complex problem to solve. With high levels of data availability (i.e., Big Data), automatic classification of such data has become an important and complex task. Hence, we explore the power of applying MapReduce based Distributed AdaBoosting of Extreme Learning Machine (ELM) to build a predictive bag of classification models. Accordingly, (i) data set ensembles are created; (ii) the ELM algorithm is used to build weak learners (classifier functions); and (iii) a strong learner is built from the set of weak learners. We applied this training model to the benchmark knowledge discovery and data mining data sets.

1.1LGApr 12, 2015

Classification with Extreme Learning Machine and Ensemble Algorithms Over Randomly Partitioned Data

Ferhat Özgür Çatak

In this age of Big Data, machine learning based data mining methods are extensively used to inspect large scale data sets. Deriving applicable predictive models from these types of data sets is challenging because of their high complexity. With high levels of data availability, automated classification of data sets has become a critical and complicated task. In this paper, the power of applying MapReduce based Distributed AdaBoosting of Extreme Learning Machine (ELM) is explored to build a reliable predictive bag of classification models. Thus, (i) dataset ensembles are built; (ii) the ELM algorithm is used to build weak classification models; and (iii) a strong classification model is built from the set of weak classification models. This training model is applied to the publicly available knowledge discovery and data mining datasets.

1.4LGOct 10, 2014

Polarization Measurement of High Dimensional Social Media Messages With Support Vector Machine Algorithm Using Mapreduce

Ferhat Özgür Çatak

In this article, we propose a new Support Vector Machine (SVM) training algorithm based on the distributed MapReduce technique. In the literature, much research shows that SVM has the highest generalization ability among classification algorithms used in machine learning. Also, the SVM classifier model is not affected by correlations among the features. However, SVM uses quadratic optimization techniques in its training phase: the SVM algorithm is formulated as a quadratic optimization problem, which has $O(m^3)$ time and $O(m^2)$ space complexity, where $m$ is the training set size. The computation time of SVM training is quadratic in the number of training instances. For this reason, SVM is not a suitable classification algorithm for large scale dataset classification. To solve this training problem, we developed a new distributed MapReduce method. Accordingly, (i) the SVM algorithm is trained on each distributed dataset individually; (ii) all support vectors of the classifier models from every trained node are merged; and (iii) these two steps are iterated until the classifier model converges to the optimal classifier function. In the implementation phase, a large scale social media dataset is represented as a TFxIDF matrix. The matrix is used for sentiment analysis to obtain a polarization value. Two-class and three-class models are created for classification. Confusion matrices of each classification model are presented in tables. The social media corpus consists of messages of 108 public and 66 private universities in Turkey. Twitter is the source of the corpus, and user messages are collected using the Twitter Streaming API. Results are shown in graphics and tables.
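To make the TFxIDF representation concrete, the sketch below builds a small document-term matrix in which each entry is the term frequency in a message multiplied by the log inverse document frequency of the term. The abstract does not say which TF-IDF variant or tokenizer was used, so the whitespace tokenization, the weighting formula, and the sample messages are assumptions; messages are assumed to be non-empty.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Return (vocabulary, matrix) with one row per document and one column per term."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for doc in tokenized for term in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        # tf = relative frequency in the document, idf = log(N / df).
        row = [(counts[term] / len(doc)) * math.log(n_docs / df[term]) for term in vocab]
        matrix.append(row)
    return vocab, matrix

# Hypothetical messages standing in for the collected corpus.
messages = ["great campus great", "bad campus"]
vocab, matrix = tfidf_matrix(messages)
for message, row in zip(messages, matrix):
    print(message, "->", dict(zip(vocab, row)))
```

Terms that occur in every message, such as `campus` here, receive a weight of zero, so the rows emphasize the words that distinguish one message from another before they are passed to a classifier.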

2.9LGDec 15, 2013

A MapReduce based distributed SVM algorithm for binary classification

Ferhat Özgür Çatak, Mehmet Erdal Balaban

Although the Support Vector Machine (SVM) algorithm generalizes well to unseen examples after the training phase and has a small loss value, it is not suitable for real-life classification and regression problems, because SVMs cannot handle training datasets with hundreds of thousands of examples. In previous studies on distributed machine learning algorithms, SVM is trained over a costly and preconfigured computer environment. In this research, we present a MapReduce based distributed parallel SVM training algorithm for binary classification problems. This work shows how to distribute the optimization problem over cloud computing systems with the MapReduce technique. In the second step of this work, we used statistical learning theory to find the predictive hypothesis that minimizes the empirical risk over the hypothesis spaces created by the reduce function of MapReduce. The results of this research are important for training SVM based classifiers on big datasets. We show that, with iterative training on the split dataset using the MapReduce technique, the accuracy of the classifier function converges to the accuracy of the globally optimal classifier function in a finite number of iterations. The algorithm performance was measured on samples from the letter recognition and pen-based recognition of handwritten digits datasets.

6.3LGJan 1, 2013

CloudSVM : Training an SVM Classifier in Cloud Computing Systems

F. Ozgur Catak, M. Erdal Balaban

In conventional methods, distributed support vector machine (SVM) algorithms are trained over pre-configured intranet/internet environments to find an optimal classifier. These methods are very complicated and costly for large datasets. Hence, we propose a method, referred to as the Cloud SVM training mechanism (CloudSVM), that runs in a cloud computing environment with the MapReduce technique for distributed machine learning applications. Accordingly, (i) the SVM algorithm is trained on distributed cloud storage servers that work concurrently; (ii) all support vectors from every trained cloud node are merged; and (iii) these two steps are iterated until the SVM converges to the optimal classifier function. Large scale data sets cannot be trained with the SVM algorithm on a single computer. The results of this study are important for training on large scale data sets in machine learning applications. We show that iterative SVM training on the split data set in a cloud computing environment converges to a globally optimal classifier in a finite number of iterations.