Serkan Kiranyaz

IV
h-index51
81papers
6,309citations
Novelty47%
AI Score49

81 Papers

IVSep 29, 2022Code
R2C-GAN: Restore-to-Classify Generative Adversarial Networks for Blind X-Ray Restoration and COVID-19 Classification

Mete Ahishali, Aysen Degerli, Serkan Kiranyaz et al.

Restoration of poor quality images with a blended set of artifacts plays a vital role for a reliable diagnosis. Existing studies have focused on specific restoration problems such as image deblurring, denoising, and exposure correction where there is usually a strong assumption on the artifact type and severity. As a pioneer study in blind X-ray restoration, we propose a joint model for generic image restoration and classification: Restore-to-Classify Generative Adversarial Networks (R2C-GANs). Such a jointly optimized model keeps any disease intact after the restoration. Therefore, this will naturally lead to a higher diagnosis performance thanks to the improved X-ray image quality. To accomplish this crucial objective, we define the restoration task as an Image-to-Image translation problem from poor quality having noisy, blurry, or over/under-exposed images to high quality image domain. The proposed R2C-GAN model is able to learn forward and inverse transforms between the two domains using unpaired training samples. Simultaneously, the joint classification preserves the disease label during restoration. Moreover, the R2C-GANs are equipped with operational layers/neurons reducing the network depth and further boosting both restoration and classification performances. The proposed joint model is extensively evaluated over the QaTa-COV19 dataset for Coronavirus Disease 2019 (COVID-19) classification. The proposed restoration approach achieves over 90% F1-Score which is significantly higher than the performance of any deep model. Moreover, in the qualitative analysis, the restoration performance of R2C-GANs is approved by a group of medical doctors. We share the software implementation at https://github.com/meteahishali/R2C-GAN.

CVJul 12, 2023Code
Operational Support Estimator Networks

Mete Ahishali, Mehmet Yamac, Serkan Kiranyaz et al.

In this work, we propose a novel approach called Operational Support Estimator Networks (OSENs) for the support estimation task. Support Estimation (SE) is defined as finding the locations of non-zero elements in sparse signals. By its very nature, the mapping between the measurement and sparse signal is a non-linear operation. Traditional support estimators rely on computationally expensive iterative signal recovery techniques to achieve such non-linearity. Contrary to the convolutional layers, the proposed OSEN approach consists of operational layers that can learn such complex non-linearities without the need for deep networks. In this way, the performance of non-iterative support estimation is greatly improved. Moreover, the operational layers comprise so-called generative super neurons with non-local kernels. The kernel location for each neuron/feature map is optimized jointly for the SE task during training. We evaluate the OSENs in three different applications: i. support estimation from Compressive Sensing (CS) measurements, ii. representation-based classification, and iii. learning-aided CS reconstruction where the output of OSENs is used as prior knowledge to the CS algorithm for enhanced reconstruction. Experimental results show that the proposed approach achieves computational efficiency and outperforms competing methods, especially at low measurement rates by significant margins. The software implementation is shared at https://github.com/meteahishali/OSEN.

LGApr 7, 2022
Global ECG Classification by Self-Operational Neural Networks with Feature Injection

Muhammad Uzair Zahid, Serkan Kiranyaz, Moncef Gabbouj

Objective: Global (inter-patient) ECG classification for arrhythmia detection over Electrocardiogram (ECG) signal is a challenging task for both humans and machines. The main reason is the significant variations of both normal and arrhythmic ECG patterns among patients. Automating this process with utmost accuracy is, therefore, highly desirable due to the advent of wearable ECG sensors. However, even with numerous deep learning approaches proposed recently, there is still a notable gap in the performance of global and patient-specific ECG classification performances. This study proposes a novel approach to narrow this gap and propose a real-time solution with shallow and compact 1D Self-Organized Operational Neural Networks (Self-ONNs). Methods: In this study, we propose a novel approach for inter-patient ECG classification using a compact 1D Self-ONN by exploiting morphological and timing information in heart cycles. We used 1D Self-ONN layers to automatically learn morphological representations from ECG data, enabling us to capture the shape of the ECG waveform around the R peaks. We further inject temporal features based on RR interval for timing characterization. The classification layers can thus benefit from both temporal and learned features for the final arrhythmia classification. Results: Using the MIT-BIH arrhythmia benchmark database, the proposed method achieves the highest classification performance ever achieved, i.e., 99.21% precision, 99.10% recall, and 99.15% F1-score for normal (N) segments; 82.19% precision, 82.50% recall, and 82.34% F1-score for the supra-ventricular ectopic beat (SVEBs); and finally, 94.41% precision, 96.10% recall, and 95.2% F1-score for the ventricular-ectopic beats (VEBs).

LGDec 7, 2022
Zero-Shot Transfer Learning for Structural Health Monitoring using Generative Adversarial Networks and Spectral Mapping

Mohammad Hesam Soleimani-Babakamali, Roksana Soleimani-Babakamali, Kourosh Nasrollahzadeh et al.

Gathering properly labelled, adequately rich, and case-specific data for successfully training a data-driven or hybrid model for structural health monitoring (SHM) applications is a challenging task. We posit that a Transfer Learning (TL) method that utilizes available data in any relevant source domain and directly applies to the target domain through domain adaptation can provide substantial remedies to address this issue. Accordingly, we present a novel TL method that differentiates between the source's no-damage and damage cases and utilizes a domain adaptation (DA) technique. The DA module transfers the accumulated knowledge in contrasting no-damage and damage cases in the source domain to the target domain, given only the target's no-damage case. High-dimensional features allow employing signal processing domain knowledge to devise a generalizable DA approach. The Generative Adversarial Network (GAN) architecture is adopted for learning since its optimization process accommodates high-dimensional inputs in a zero-shot setting. At the same time, its training objective conforms seamlessly with the case of no-damage and damage data in SHM since its discriminator network differentiates between real (no damage) and fake (possibly unseen damage) data. An extensive set of experimental results demonstrates the method's success in transferring knowledge on differences between no-damage and damage cases across three strongly heterogeneous independent target structures. The area under the Receiver Operating Characteristics curves (Area Under the Curve - AUC) is used to evaluate the differentiation between no-damage and damage cases in the target domain, reaching values as high as 0.95. With no-damage and damage cases discerned from each other, zero-shot structural damage detection is carried out. The mean F1 scores for all damages in the three independent datasets are 0.978, 0.992, and 0.975.

SPSep 25, 2023
A Novel Deep Learning Technique for Morphology Preserved Fetal ECG Extraction from Mother ECG using 1D-CycleGAN

Promit Basak, A. H. M Nazmus Sakib, Muhammad E. H. Chowdhury et al.

Monitoring the electrical pulse of fetal heart through a non-invasive fetal electrocardiogram (fECG) can easily detect abnormalities in the developing heart to significantly reduce the infant mortality rate and post-natal complications. Due to the overlapping of maternal and fetal R-peaks, the low amplitude of the fECG, systematic and ambient noises, typical signal extraction methods, such as adaptive filters, independent component analysis, empirical mode decomposition, etc., are unable to produce satisfactory fECG. While some techniques can produce accurate QRS waves, they often ignore other important aspects of the ECG. Our approach, which is based on 1D CycleGAN, can reconstruct the fECG signal from the mECG signal while maintaining the morphology due to extensive preprocessing and appropriate framework. The performance of our solution was evaluated by combining two available datasets from Physionet, "Abdominal and Direct Fetal ECG Database" and "Fetal electrocardiograms, direct and abdominal with reference heartbeat annotations", where it achieved an average PCC and Spectral-Correlation score of 88.4% and 89.4%, respectively. It detects the fQRS of the signal with accuracy, precision, recall and F1 score of 92.6%, 97.6%, 94.8% and 96.4%, respectively. It can also accurately produce the estimation of fetal heart rate and R-R interval with an error of 0.25% and 0.27%, respectively. The main contribution of our work is that, unlike similar studies, it can retain the morphology of the ECG signal with high fidelity. The accuracy of our solution for fetal heart rate and R-R interval length is comparable to existing state-of-the-art techniques. This makes it a highly effective tool for early diagnosis of fetal heart diseases and regular health checkups of the fetus.

IVMar 18, 2021
Advance Warning Methodologies for COVID-19 using Chest X-Ray Images

Mete Ahishali, Aysen Degerli, Mehmet Yamac et al.

Coronavirus disease 2019 (COVID-19) has rapidly become a global health concern after its first known detection in December 2019. As a result, accurate and reliable advance warning system for the early diagnosis of COVID-19 has now become a priority. The detection of COVID-19 in early stages is not a straightforward task from chest X-ray images according to expert medical doctors because the traces of the infection are visible only when the disease has progressed to a moderate or severe stage. In this study, our first aim is to evaluate the ability of recent \textit{state-of-the-art} Machine Learning techniques for the early detection of COVID-19 from chest X-ray images. Both compact classifiers and deep learning approaches are considered in this study. Furthermore, we propose a recent compact classifier, Convolutional Support Estimator Network (CSEN) approach for this purpose since it is well-suited for a scarce-data classification task. Finally, this study introduces a new benchmark dataset called Early-QaTa-COV19, which consists of 1065 early-stage COVID-19 pneumonia samples (very limited or no infection signs) labelled by the medical doctors and 12 544 samples for control (normal) class. A detailed set of experiments shows that the CSEN achieves the top (over 97%) sensitivity with over 95.5% specificity. Moreover, DenseNet-121 network produces the leading performance among other deep networks with 95% sensitivity and 99.74% specificity.

LGDec 12, 2022
Zero-Shot Motor Health Monitoring by Blind Domain Transition

Serkan Kiranyaz, Ozer Can Devecioglu, Amir Alhams et al.

Continuous long-term monitoring of motor health is crucial for the early detection of abnormalities such as bearing faults (up to 51% of motor failures are attributed to bearing faults). Despite numerous methodologies proposed for bearing fault detection, most of them require normal (healthy) and abnormal (faulty) data for training. Even with the recent deep learning (DL) methodologies trained on the labeled data from the same machine, the classification accuracy significantly deteriorates when one or few conditions are altered. Furthermore, their performance suffers significantly or may entirely fail when they are tested on another machine with entirely different healthy and faulty signal patterns. To address this need, in this pilot study, we propose a zero-shot bearing fault detection method that can detect any fault on a new (target) machine regardless of the working conditions, sensor parameters, or fault characteristics. To accomplish this objective, a 1D Operational Generative Adversarial Network (Op-GAN) first characterizes the transition between normal and fault vibration signals of (a) source machine(s) under various conditions, sensor parameters, and fault types. Then for a target machine, the potential faulty signals can be generated, and over its actual healthy and synthesized faulty signals, a compact, and lightweight 1D Self-ONN fault detector can then be trained to detect the real faulty condition in real time whenever it occurs. To validate the proposed approach, a new benchmark dataset is created using two different motors working under different conditions and sensor locations. Experimental results demonstrate that this novel approach can accurately detect any bearing fault achieving an average recall rate of around 89% and 95% on two target machines regardless of its type, severity, and location.

LGApr 9, 2022
Motion Artifacts Correction from Single-Channel EEG and fNIRS Signals using Novel Wavelet Packet Decomposition in Combination with Canonical Correlation Analysis

Md Shafayet Hossain, Muhammad E. H. Chowdhury, Mamun Bin Ibne Reaz et al.

The electroencephalogram (EEG) and functional near-infrared spectroscopy (fNIRS) signals, highly non-stationary in nature, greatly suffers from motion artifacts while recorded using wearable sensors. This paper proposes two robust methods: i) Wavelet packet decomposition (WPD), and ii) WPD in combination with canonical correlation analysis (WPD-CCA), for motion artifact correction from single-channel EEG and fNIRS signals. The efficacy of these proposed techniques is tested using a benchmark dataset and the performance of the proposed methods is measured using two well-established performance matrices: i) Difference in the signal to noise ratio (ΔSNR) and ii) Percentage reduction in motion artifacts (η). The proposed WPD-based single-stage motion artifacts correction technique produces the highest average ΔSNR (29.44 dB) when db2 wavelet packet is incorporated whereas the greatest average η (53.48%) is obtained using db1 wavelet packet for all the available 23 EEG recordings. Our proposed two-stage motion artifacts correction technique i.e. the WPD-CCA method utilizing db1 wavelet packet has shown the best denoising performance producing an average ΔSNR and η values of 30.76 dB and 59.51%, respectively for all the EEG recordings. On the other hand, the two-stage motion artifacts removal technique i.e. WPD-CCA has produced the best average ΔSNR (16.55 dB, utilizing db1 wavelet packet) and largest average η (41.40%, using fk8 wavelet packet). The highest average ΔSNR and η using single-stage artifacts removal techniques (WPD) are found as 16.11 dB and 26.40%, respectively for all the fNIRS signals using fk4 wavelet packet. In both EEG and fNIRS modalities, the percentage reduction in motion artifacts increases by 11.28% and 56.82%, respectively when two-stage WPD-CCA techniques are employed.

IVApr 14, 2022
Early Myocardial Infarction Detection with One-Class Classification over Multi-view Echocardiography

Aysen Degerli, Fahad Sohrab, Serkan Kiranyaz et al.

Myocardial infarction (MI) is the leading cause of mortality and morbidity in the world. Early therapeutics of MI can ensure the prevention of further myocardial necrosis. Echocardiography is the fundamental imaging technique that can reveal the earliest sign of MI. However, the scarcity of echocardiographic datasets for the MI detection is the major issue for training data-driven classification algorithms. In this study, we propose a framework for early detection of MI over multi-view echocardiography that leverages one-class classification (OCC) techniques. The OCC techniques are used to train a model for detecting a specific target class using instances from that particular category only. We investigated the usage of uni-modal and multi-modal one-class classification techniques in the proposed framework using the HMC-QU dataset that includes apical 4-chamber (A4C) and apical 2-chamber (A2C) views in a total of 260 echocardiography recordings. Experimental results show that the multi-modal approach achieves a sensitivity level of 85.23% and F1-Score of 80.21%.

LGNov 30, 2023
Real-Time Vibration-Based Bearing Fault Diagnosis Under Time-Varying Speed Conditions

Tuomas Jalonen, Mohammad Al-Sa'd, Serkan Kiranyaz et al.

Detection of rolling-element bearing faults is crucial for implementing proactive maintenance strategies and for minimizing the economic and operational consequences of unexpected failures. However, many existing techniques are developed and tested under strictly controlled conditions, limiting their adaptability to the diverse and dynamic settings encountered in practical applications. This paper presents an efficient real-time convolutional neural network (CNN) for diagnosing multiple bearing faults under various noise levels and time-varying rotational speeds. Additionally, we propose a novel Fisher-based spectral separability analysis (SSA) method to elucidate the effectiveness of the designed CNN model. We conducted experiments on both healthy bearings and bearings afflicted with inner race, outer race, and roller ball faults. The experimental results show the superiority of our model over the current state-of-the-art approach in three folds: it achieves substantial accuracy gains of up to 15.8%, it is robust to noise with high performance across various signal-to-noise ratios, and it runs in real-time with processing durations five times less than acquisition. Additionally, by using the proposed SSA technique, we offer insights into the model's performance and underscore its effectiveness in tackling real-world challenges.

LGJul 14, 2022
A Personalized Zero-Shot ECG Arrhythmia Monitoring System: From Sparse Representation Based Domain Adaption to Energy Efficient Abnormal Beat Detection for Practical ECG Surveillance

Mehmet Yamaç, Mert Duman, İlke Adalıoğlu et al.

This paper proposes a low-cost and highly accurate ECG-monitoring system intended for personalized early arrhythmia detection for wearable mobile sensors. Earlier supervised approaches for personalized ECG monitoring require both abnormal and normal heartbeats for the training of the dedicated classifier. However, in a real-world scenario where the personalized algorithm is embedded in a wearable device, such training data is not available for healthy people with no cardiac disorder history. In this study, (i) we propose a null space analysis on the healthy signal space obtained via sparse dictionary learning, and investigate how a simple null space projection or alternatively regularized least squares-based classification methods can reduce the computational complexity, without sacrificing the detection accuracy, when compared to sparse representation-based classification. (ii) Then we introduce a sparse representation-based domain adaptation technique in order to project other existing users' abnormal and normal signals onto the new user's signal space, enabling us to train the dedicated classifier without having any abnormal heartbeat of the new user. Therefore, zero-shot learning can be achieved without the need for synthetic abnormal heartbeat generation. An extensive set of experiments performed on the benchmark MIT-BIH ECG dataset shows that when this domain adaptation-based training data generator is used with a simple 1-D CNN classifier, the method outperforms the prior work by a significant margin. (iii) Then, by combining (i) and (ii), we propose an ensemble classifier that further improves the performance. This approach for zero-shot arrhythmia detection achieves an average accuracy level of 98.2% and an F1-Score of 92.8%. Finally, a personalized energy-efficient ECG monitoring scheme is proposed using the above-mentioned innovations.

LGSep 27, 2023
SAF-Net: Self-Attention Fusion Network for Myocardial Infarction Detection using Multi-View Echocardiography

Ilke Adalioglu, Mete Ahishali, Aysen Degerli et al.

Myocardial infarction (MI) is a severe case of coronary artery disease (CAD) and ultimately, its detection is substantial to prevent progressive damage to the myocardium. In this study, we propose a novel view-fusion model named self-attention fusion network (SAF-Net) to detect MI from multi-view echocardiography recordings. The proposed framework utilizes apical 2-chamber (A2C) and apical 4-chamber (A4C) view echocardiography recordings for classification. Three reference frames are extracted from each recording of both views and deployed pre-trained deep networks to extract highly representative features. The SAF-Net model utilizes a self-attention mechanism to learn dependencies in extracted feature vectors. The proposed model is computationally efficient thanks to its compact architecture having three main parts: a feature embedding to reduce dimensionality, self-attention for view-pooling, and dense layers for the classification. Experimental evaluation is performed using the HMC-QU-TAU dataset which consists of 160 patients with A2C and A4C view echocardiography recordings. The proposed SAF-Net model achieves a high-performance level with 88.26% precision, 77.64% sensitivity, and 78.13% accuracy. The results demonstrate that the SAF-Net model achieves the most accurate MI detection over multi-view echocardiography recordings.

SDDec 30, 2022
Blind Restoration of Real-World Audio by 1D Operational GANs

Turker Ince, Serkan Kiranyaz, Ozer Can Devecioglu et al.

Objective: Despite numerous studies proposed for audio restoration in the literature, most of them focus on an isolated restoration problem such as denoising or dereverberation, ignoring other artifacts. Moreover, assuming a noisy or reverberant environment with limited number of fixed signal-to-distortion ratio (SDR) levels is a common practice. However, real-world audio is often corrupted by a blend of artifacts such as reverberation, sensor noise, and background audio mixture with varying types, severities, and duration. In this study, we propose a novel approach for blind restoration of real-world audio signals by Operational Generative Adversarial Networks (Op-GANs) with temporal and spectral objective metrics to enhance the quality of restored audio signal regardless of the type and severity of each artifact corrupting it. Methods: 1D Operational-GANs are used with generative neuron model optimized for blind restoration of any corrupted audio signal. Results: The proposed approach has been evaluated extensively over the benchmark TIMIT-RAR (speech) and GTZAN-RAR (non-speech) datasets corrupted with a random blend of artifacts each with a random severity to mimic real-world audio signals. Average SDR improvements of over 7.2 dB and 4.9 dB are achieved, respectively, which are substantial when compared with the baseline methods. Significance: This is a pioneer study in blind audio restoration with the unique capability of direct (time-domain) restoration of real-world audio whilst achieving an unprecedented level of performance for a wide SDR range and artifact types. Conclusion: 1D Op-GANs can achieve robust and computationally effective real-world audio restoration with significantly improved performance. The source codes and the generated real-world audio datasets are shared publicly with the research community in a dedicated GitHub repository1.

CVJul 17, 2022
2D Self-Organized ONN Model For Handwritten Text Recognition

Hanadi Hassen Mohammed, Junaid Malik, Somaya Al-Madeed et al.

Deep Convolutional Neural Networks (CNNs) have recently reached state-of-the-art Handwritten Text Recognition (HTR) performance. However, recent research has shown that typical CNNs' learning performance is limited since they are homogeneous networks with a simple (linear) neuron model. With their heterogeneous network structure incorporating non-linear neurons, Operational Neural Networks (ONNs) have recently been proposed to address this drawback. Self-ONNs are self-organized variations of ONNs with the generative neuron model that can generate any non-linear function using the Taylor approximation. In this study, in order to improve the state-of-the-art performance level in HTR, the 2D Self-organized ONNs (Self-ONNs) in the core of a novel network model are proposed. Moreover, deformable convolutions, which have recently been demonstrated to tackle variations in the writing styles better, are utilized in this study. The results over the IAM English dataset and HADARA80P Arabic dataset show that the proposed model with the operational layers of Self-ONNs significantly improves Character Error Rate (CER) and Word Error Rate (WER). Compared with its counterpart CNNs, Self-ONNs reduce CER and WER by 1.2% and 3.4 % in the HADARA80P and 0.199% and 1.244% in the IAM dataset. The results over the benchmark IAM demonstrate that the proposed model with the operational layers of Self-ONNs outperforms recent deep CNN models by a significant margin while the use of Self-ONNs with deformable convolutions demonstrates exceptional results.

LGMay 21, 2022
Evaluating Performance of Machine Learning Models for Diabetic Sensorimotor Polyneuropathy Severity Classification using Biomechanical Signals during Gait

Fahmida Haque, Mamun Bin Ibne Reaz, Muhammad Enamul Hoque Chowdhury et al.

Diabetic sensorimotor polyneuropathy (DSPN) is one of the prevalent forms of neuropathy affected by diabetic patients that involves alterations in biomechanical changes in human gait. In literature, for the last 50 years, researchers are trying to observe the biomechanical changes due to DSPN by studying muscle electromyography (EMG), and ground reaction forces (GRF). However, the literature is contradictory. In such a scenario, we are proposing to use Machine learning techniques to identify DSPN patients by using EMG, and GRF data. We have collected a dataset consists of three lower limb muscles EMG (tibialis anterior (TA), vastus lateralis (VL), gastrocnemius medialis (GM) and 3-dimensional GRF components (GRFx, GRFy, and GRFz). Raw EMG and GRF signals were preprocessed, and a newly proposed feature extraction technique scheme from literature was applied to extract the best features from the signals. The extracted feature list was ranked using Relief feature ranking techniques, and highly correlated features were removed. We have trained different ML models to find out the best-performing model and optimized that model. We trained the optimized ML models for different combinations of muscles and GRF components features, and the performance matrix was evaluated. This study has found ensemble classifier model was performing in identifying DSPN Severity, and we optimized it before training. For EMG analysis, we have found the best accuracy of 92.89% using the Top 14 features for features from GL, VL and TA muscles combined. In the GRF analysis, the model showed 94.78% accuracy by using the Top 15 features for the feature combinations extracted from GRFx, GRFy and GRFz signals. The performance of ML-based DSPN severity classification models, improved significantly, indicating their reliability in DSPN severity classification, for biomechanical data.

CVFeb 23, 2023
Real-Time Damage Detection in Fiber Lifting Ropes Using Lightweight Convolutional Neural Networks

Tuomas Jalonen, Mohammad Al-Sa'd, Roope Mellanen et al.

The health and safety hazards posed by worn crane lifting ropes mandate periodic inspection for damage. This task is time-consuming, prone to human error, halts operation, and may result in the premature disposal of ropes. Therefore, we propose using efficient deep learning and computer vision methods to automate the process of detecting damaged ropes. Specifically, we present a vision-based system for detecting damage in synthetic fiber rope images using lightweight convolutional neural networks. We develop a camera-based apparatus to photograph the lifting rope's surface, while in operation, and capture the progressive wear-and-tear as well as the more significant degradation in the rope's health state. Experts from Konecranes annotate the collected images in accordance with the rope's condition; normal or damaged. Then, we pre-process the images, systematically design a deep learning model, evaluate its detection and prediction performance, analyze its computational complexity, and compare it with various other models. Experimental results show the proposed model outperforms other similar techniques with 96.5% accuracy, 94.8% precision, 98.3% recall, 96.5% F1-score, and 99.3% AUC. Besides, they demonstrate the model's real-time operation, low memory footprint, robustness to various environmental and operational conditions, and adequacy for deployment in industrial applications such as lifting, mooring, towing, climbing, and sailing.

LGJul 18, 2024
BRSR-OpGAN: Blind Radar Signal Restoration using Operational Generative Adversarial Network

Muhammad Uzair Zahid, Serkan Kiranyaz, Alper Yildirim et al.

Objective: Many studies on radar signal restoration in the literature focus on isolated restoration problems, such as denoising over a certain type of noise, while ignoring other types of artifacts. Additionally, these approaches usually assume a noisy environment with a limited set of fixed signal-to-noise ratio (SNR) levels. However, real-world radar signals are often corrupted by a blend of artifacts, including but not limited to unwanted echo, sensor noise, intentional jamming, and interference, each of which can vary in type, severity, and duration. This study introduces Blind Radar Signal Restoration using an Operational Generative Adversarial Network (BRSR-OpGAN), which uses a dual domain loss in the temporal and spectral domains. This approach is designed to improve the quality of radar signals, regardless of the diversity and intensity of the corruption. Methods: The BRSR-OpGAN utilizes 1D Operational GANs, which use a generative neuron model specifically optimized for blind restoration of corrupted radar signals. This approach leverages GANs' flexibility to adapt dynamically to a wide range of artifact characteristics. Results: The proposed approach has been extensively evaluated using a well-established baseline and a newly curated extended dataset called the Blind Radar Signal Restoration (BRSR) dataset. This dataset was designed to simulate real-world conditions and includes a variety of artifacts, each varying in severity. The evaluation shows an average SNR improvement over 15.1 dB and 14.3 dB for the baseline and BRSR datasets, respectively. Finally, even on resource-constrained platforms, the proposed approach can be applied in real-time.

LGJul 6, 2024
High-Quality and Full Bandwidth Seismic Signal Synthesis using Operational GANs

Ozer Can Devecioglu, Serkan Kiranyaz, Zafer Yilmaz et al.

Vibration sensors are essential in acquiring seismic activity for an accurate earthquake assessment. The state-of-the-art sensors can provide the best signal quality and the highest bandwidth; however, their high cost usually hinders a wide range of applicability and coverage, which is otherwise possible with their basic and cheap counterparts. But, their poor quality and low bandwidth can significantly degrade the signal fidelity and result in an imprecise analysis. To address these drawbacks, in this study, we propose a novel, high-quality, and full bandwidth seismic signal synthesis by transforming the signal acquired from an inferior sensor. We employ 1D Operational Generative Adversarial Networks (Op-GANs) with novel loss functions to achieve this. Therefore, the study's key contributions include releasing a new dataset, addressing operational constraints in seismic monitoring, and pioneering a deep-learning transformation technique to create the first virtual seismic sensor. The proposed method is extensively evaluated over the Simulated Ground Motion (SimGM) benchmark dataset, and the results demonstrated that the proposed approach significantly improves the quality and bandwidth of seismic signals acquired from a variety of sensors, including a cheap seismic sensor, the CSN-Phidgets, and the integrated accelerometers of an Android, and iOS phone, to the same level as the state-of-the-art sensor (e.g., Kinemetrics-Episensor). The SimGM dataset, our results, and the optimized PyTorch implementation of the proposed approach are publicly shared.

LGMay 19
Axiomatizing Neural Networks via Pursuit of Subspaces

Mehmet Yamac, Mert Duman, Ugur Akpinar et al.

While deep neural networks have achieved remarkable success across a wide range of domains, their underlying mechanisms remain poorly understood, and they are often regarded as black boxes. This gap between empirical performance and theoretical understanding poses a challenge analogous to the pre-axiomatic stage of classical geometry. In this work, we introduce the Pursuit of Subspaces (PoS) hypothesis, an axiomatic framework that formulates neural network behavior through a set of geometric postulates. These axioms, together with their derived consequences, provide a unified perspective on representation, computation, and generalization in both shallow and deep architectures. We show that this framework yields geometric explanations for fundamental questions in deep learning, including representation structure, architectural mechanisms, and generalization behavior, offering a principled step toward a coherent theoretical foundation.

LGApr 17, 2025Code
Multiscale Tensor Summation Factorization as a New Neural Network Layer (MTS Layer) for Multidimensional Data Processing

Mehmet Yamaç, Muhammad Numan Yousaf, Serkan Kiranyaz et al.

Multilayer perceptrons (MLP), or fully connected artificial neural networks, are known for performing vector-matrix multiplications using learnable weight matrices; however, their practical application in many machine learning tasks, especially in computer vision, can be limited due to the high dimensionality of input-output pairs at each layer. To improve efficiency, convolutional operators have been utilized to facilitate weight sharing and local connections, yet they are constrained by limited receptive fields. In this paper, we introduce Multiscale Tensor Summation (MTS) Factorization, a novel neural network operator that implements tensor summation at multiple scales, where each tensor to be summed is obtained through Tucker-decomposition-like mode products. Unlike other tensor decomposition methods in the literature, MTS is not introduced as a network compression tool; instead, as a new backbone neural layer. MTS not only reduces the number of parameters required while enhancing the efficiency of weight optimization compared to traditional dense layers (i.e., unfactorized weight matrices in MLP layers), but it also demonstrates clear advantages over convolutional layers. The proof-of-concept experimental comparison of the proposed MTS networks with MLPs and Convolutional Neural Networks (CNNs) demonstrates their effectiveness across various tasks, such as classification, compression, and signal restoration. Additionally, when integrated with modern non-linear units such as the multi-head gate (MHG), also introduced in this study, the corresponding neural network, MTSNet, demonstrates a more favorable complexity-performance tradeoff compared to state-of-the-art transformers in various computer vision applications. The software implementation of the MTS layer and the corresponding MTS-based networks, MTSNets, is shared at https://github.com/mehmetyamac/MTSNet.

CVFeb 20, 2022Code
SRL-SOA: Self-Representation Learning with Sparse 1D-Operational Autoencoder for Hyperspectral Image Band Selection

Mete Ahishali, Serkan Kiranyaz, Iftikhar Ahmad et al.

The band selection in the hyperspectral image (HSI) data processing is an important task considering its effect on the computational complexity and accuracy. In this work, we propose a novel framework for the band selection problem: Self-Representation Learning (SRL) with Sparse 1D-Operational Autoencoder (SOA). The proposed SLR-SOA approach introduces a novel autoencoder model, SOA, that is designed to learn a representation domain where the data are sparsely represented. Moreover, the network composes of 1D-operational layers with the non-linear neuron model. Hence, the learning capability of neurons (filters) is greatly improved with shallow architectures. Using compact architectures is especially crucial in autoencoders as they tend to overfit easily because of their identity mapping objective. Overall, we show that the proposed SRL-SOA band selection approach outperforms the competing methods over two HSI data including Indian Pines and Salinas-A considering the achieved land cover classification accuracies. The software implementation of the SRL-SOA approach is shared publicly at https://github.com/meteahishali/SRL-SOA.

IVNov 9, 2021Code
Early Myocardial Infarction Detection over Multi-view Echocardiography

Aysen Degerli, Serkan Kiranyaz, Tahir Hamid et al.

Myocardial infarction (MI) is the leading cause of mortality in the world that occurs due to a blockage of the coronary arteries feeding the myocardium. An early diagnosis of MI and its localization can mitigate the extent of myocardial damage by facilitating early therapeutic interventions. Following the blockage of a coronary artery, the regional wall motion abnormality (RWMA) of the ischemic myocardial segments is the earliest change to set in. Echocardiography is the fundamental tool to assess any RWMA. Assessing the motion of the left ventricle (LV) wall only from a single echocardiography view may lead to missing the diagnosis of MI as the RWMA may not be visible on that specific view. Therefore, in this study, we propose to fuse apical 4-chamber (A4C) and apical 2-chamber (A2C) views in which a total of 12 myocardial segments can be analyzed for MI detection. The proposed method first estimates the motion of the LV wall by Active Polynomials (APs), which extract and track the endocardial boundary to compute myocardial segment displacements. The features are extracted from the A4C and A2C view displacements, which are concatenated and fed into the classifiers to detect MI. The main contributions of this study are 1) creation of a new benchmark dataset by including both A4C and A2C views in a total of 260 echocardiography recordings, which is publicly shared with the research community, 2) improving the performance of the prior work of threshold-based APs by a Machine Learning based approach, and 3) a pioneer MI detection approach via multi-view echocardiography by fusing the information of A4C and A2C views. Experimental results show that the proposed method achieves 90.91% sensitivity and 86.36% precision for MI detection over multi-view echocardiography. The software implementation is shared at https://github.com/degerliaysen/MultiEchoAI.

CVJun 27, 2021Code
Representation Based Regression for Object Distance Estimation

Mete Ahishali, Mehmet Yamac, Serkan Kiranyaz et al.

In this study, we propose a novel approach to predict the distances of the detected objects in an observed scene. The proposed approach modifies the recently proposed Convolutional Support Estimator Networks (CSENs). CSENs are designed to compute a direct mapping for the Support Estimation (SE) task in a representation-based classification problem. We further propose and demonstrate that representation-based methods (sparse or collaborative representation) can be used in well-designed regression problems. To the best of our knowledge, this is the first representation-based method proposed for performing a regression task by utilizing the modified CSENs; and hence, we name this novel approach as Representation-based Regression (RbR). The initial version of CSENs has a proxy mapping stage (i.e., a coarse estimation for the support set) that is required for the input. In this study, we improve the CSEN model by proposing Compressive Learning CSEN (CL-CSEN) that has the ability to jointly optimize the so-called proxy mapping stage along with convolutional layers. The experimental evaluations using the KITTI 3D Object Detection distance estimation dataset show that the proposed method can achieve a significantly improved distance estimation performance over all competing methods. Finally, the software implementations of the methods are publicly shared at https://github.com/meteahishali/CSENDistance.

CVFeb 10, 2025
Deep Learning in Automated Power Line Inspection: A Review

Md. Ahasan Atick Faisal, Imene Mecheter, Yazan Qiblawey et al.

In recent years, power line maintenance has seen a paradigm shift by moving towards computer vision-powered automated inspection. The utilization of an extensive collection of videos and images has become essential for maintaining the reliability, safety, and sustainability of electricity transmission. A significant focus on applying deep learning techniques for enhancing power line inspection processes has been observed in recent research. A comprehensive review of existing studies has been conducted in this paper, to aid researchers and industries in developing improved deep learning-based systems for analyzing power line data. The conventional steps of data analysis in power line inspections have been examined, and the body of current research has been systematically categorized into two main areas: the detection of components and the diagnosis of faults. A detailed summary of the diverse methods and techniques employed in these areas has been encapsulated, providing insights into their functionality and use cases. Special attention has been given to the exploration of deep learning-based methodologies for the analysis of power line inspection data, with an exposition of their fundamental principles and practical applications. Moreover, a vision for future research directions has been outlined, highlighting the need for advancements such as edge-cloud collaboration, and multi-modal analysis among others. Thus, this paper serves as a comprehensive resource for researchers delving into deep learning for power line analysis, illuminating the extent of current knowledge and the potential areas for future investigation.

LGJan 2, 2024
Quadratic Time-Frequency Analysis of Vibration Signals for Diagnosing Bearing Faults

Mohammad Al-Sa'd, Tuomas Jalonen, Serkan Kiranyaz et al.

Diagnosis of bearing faults is paramount to reducing maintenance costs and operational breakdowns. Bearing faults are primary contributors to machine vibrations, and analyzing their signal morphology offers insights into their health status. Unfortunately, existing approaches are optimized for controlled environments, neglecting realistic conditions such as time-varying rotational speeds and the vibration's non-stationary nature. This paper presents a fusion of time-frequency analysis and deep learning techniques to diagnose bearing faults under time-varying speeds and varying noise levels. First, we formulate the bearing fault-induced vibrations and discuss the link between their non-stationarity and the bearing's inherent and operational parameters. We also elucidate quadratic time-frequency distributions and validate their effectiveness in resolving distinctive dynamic patterns associated with different bearing faults. Based on this, we design a time-frequency convolutional neural network (TF-CNN) to diagnose various faults in rolling-element bearings. Our experimental findings undeniably demonstrate the superior performance of TF-CNN in comparison to recently developed techniques. They also assert its versatility in capturing fault-relevant non-stationary features that couple with speed changes and show its exceptional resilience to noise, consistently surpassing competing methods across various signal-to-noise ratios and performance metrics. Altogether, the TF-CNN achieves substantial accuracy improvements up to 15%, in severe noise conditions.

SDDec 17, 2023
Exploring Sound vs Vibration for Robust Fault Detection on Rotating Machinery

Serkan Kiranyaz, Ozer Can Devecioglu, Amir Alhams et al.

Robust and real-time detection of faults on rotating machinery has become an ultimate objective for predictive maintenance in various industries. Vibration-based Deep Learning (DL) methodologies have become the de facto standard for bearing fault detection as they can produce state-of-the-art detection performances under certain conditions. Despite such particular focus on the vibration signal, the utilization of sound, on the other hand, has been neglected whilst only a few studies have been proposed during the last two decades, all of which were based on a conventional ML approach. One major reason is the lack of a benchmark dataset providing a large volume of both vibration and sound data over several working conditions for different machines and sensor locations. In this study, we address this need by presenting the new benchmark Qatar University Dual-Machine Bearing Fault Benchmark dataset (QU-DMBF), which encapsulates sound and vibration data from two different motors operating under 1080 working conditions overall. Then we draw the focus on the major limitations and drawbacks of vibration-based fault detection due to numerous installation and operational conditions. Finally, we propose the first DL approach for sound-based fault detection and perform comparative evaluations between the sound and vibration over the QU-DMBF dataset. A wide range of experimental results shows that the sound-based fault detection method is significantly more robust than its vibration-based counterpart, as it is entirely independent of the sensor location, cost-effective (requiring no sensor and sensor maintenance), and can achieve the same level of the best detection performance by its vibration-based counterpart. With this study, the QU-DMBF dataset, the optimized source codes in PyTorch, and comparative evaluations are now publicly shared.

LGFeb 9, 2024
Refining Myocardial Infarction Detection: A Novel Multi-Modal Composite Kernel Strategy in One-Class Classification

Muhammad Uzair Zahid, Aysen Degerli, Fahad Sohrab et al.

Early detection of myocardial infarction (MI), a critical condition arising from coronary artery disease (CAD), is vital to prevent further myocardial damage. This study introduces a novel method for early MI detection using a one-class classification (OCC) algorithm in echocardiography. Our study overcomes the challenge of limited echocardiography data availability by adopting a novel approach based on Multi-modal Subspace Support Vector Data Description. The proposed technique involves a specialized MI detection framework employing multi-view echocardiography incorporating a composite kernel in the non-linear projection trick, fusing Gaussian and Laplacian sigmoid functions. Additionally, we enhance the update strategy of the projection matrices by adapting maximization for both or one of the modalities in the optimization process. Our method boosts MI detection capability by efficiently transforming features extracted from echocardiography data into an optimized lower-dimensional subspace. The OCC model trained specifically on target class instances from the comprehensive HMC-QU dataset that includes multiple echocardiography views indicates a marked improvement in MI detection accuracy. Our findings reveal that our proposed multi-view approach achieves a geometric mean of 71.24%, signifying a substantial advancement in echocardiography-based MI diagnosis and offering more precise and efficient diagnostic tools.

IVApr 14, 2025
Progressive Transfer Learning for Multi-Pass Fundus Image Restoration

Uyen Phan, Ozer Can Devecioglu, Serkan Kiranyaz et al.

Diabetic retinopathy is a leading cause of vision impairment, making its early diagnosis through fundus imaging critical for effective treatment planning. However, the presence of poor quality fundus images caused by factors such as inadequate illumination, noise, blurring and other motion artifacts yields a significant challenge for accurate DR screening. In this study, we propose progressive transfer learning for multi pass restoration to iteratively enhance the quality of degraded fundus images, ensuring more reliable DR screening. Unlike previous methods that often focus on a single pass restoration, multi pass restoration via PTL can achieve a superior blind restoration performance that can even improve most of the good quality fundus images in the dataset. Initially, a Cycle GAN model is trained to restore low quality images, followed by PTL induced restoration passes over the latest restored outputs to improve overall quality in each pass. The proposed method can learn blind restoration without requiring any paired data while surpassing its limitations by leveraging progressive learning and fine tuning strategies to minimize distortions and preserve critical retinal features. To evaluate PTL's effectiveness on multi pass restoration, we conducted experiments on DeepDRiD, a large scale fundus imaging dataset specifically curated for diabetic retinopathy detection. Our result demonstrates state of the art performance, showcasing PTL's potential as a superior approach to iterative image quality restoration.

CVDec 5, 2024
Blind Underwater Image Restoration using Co-Operational Regressor Networks

Ozer Can Devecioglu, Serkan Kiranyaz, Turker Ince et al.

The exploration of underwater environments is essential for applications such as biological research, archaeology, and infrastructure maintenanceHowever, underwater imaging is challenging due to the waters unique properties, including scattering, absorption, color distortion, and reduced visibility. To address such visual degradations, a variety of approaches have been proposed covering from basic signal processing methods to deep learning models; however, none of them has proven to be consistently successful. In this paper, we propose a novel machine learning model, Co-Operational Regressor Networks (CoRe-Nets), designed to achieve the best possible underwater image restoration. A CoRe-Net consists of two co-operating networks: the Apprentice Regressor (AR), responsible for image transformation, and the Master Regressor (MR), which evaluates the Peak Signal-to-Noise Ratio (PSNR) of the images generated by the AR and feeds it back to AR. CoRe-Nets are built on Self-Organized Operational Neural Networks (Self-ONNs), which offer a superior learning capability by modulating nonlinearity in kernel transformations. The effectiveness of the proposed model is demonstrated on the benchmark Large Scale Underwater Image (LSUI) dataset. Leveraging the joint learning capabilities of the two cooperating networks, the proposed model achieves the state-of-art restoration performance with significantly reduced computational complexity and often presents such results that can even surpass the visual quality of the ground truth with a 2-pass application. Our results and the optimized PyTorch implementation of the proposed approach are now publicly shared on GitHub.

CVJul 14, 2025
Expert Operational GANS: Towards Real-Color Underwater Image Restoration

Ozer Can Devecioglu, Serkan Kiranyaz, Mehmet Yamac et al.

The wide range of deformation artifacts that arise from complex light propagation, scattering, and depth-dependent attenuation makes the underwater image restoration to remain a challenging problem. Like other single deep regressor networks, conventional GAN-based restoration methods struggle to perform well across this heterogeneous domain, since a single generator network is typically insufficient to capture the full range of visual degradations. In order to overcome this limitation, we propose xOp-GAN, a novel GAN model with several expert generator networks, each trained solely on a particular subset with a certain image quality. Thus, each generator can learn to maximize its restoration performance for a particular quality range. Once a xOp-GAN is trained, each generator can restore the input image and the best restored image can then be selected by the discriminator based on its perceptual confidence score. As a result, xOP-GAN is the first GAN model with multiple generators where the discriminator is being used during the inference of the regression task. Experimental results on benchmark Large Scale Underwater Image (LSUI) dataset demonstrates that xOp-GAN achieves PSNR levels up to 25.16 dB, surpassing all single-regressor models by a large margin even, with reduced complexity.

LGMar 14, 2025
Semi-Supervised Co-Training of Time and Time-Frequency Models: Application to Bearing Fault Diagnosis

Tuomas Jalonen, Mohammad Al-Sa'd, Serkan Kiranyaz et al.

Neural networks require massive amounts of annotated data to train intelligent solutions. Acquiring many labeled data in industrial applications is often difficult; therefore, semi-supervised approaches are preferred. We propose a new semi-supervised co-training method, which combines time and time-frequency (TF) machine learning models to improve performance and reliability. The developed framework collaboratively co-trains fast time-domain models by utilizing high-performing TF techniques without increasing the inference complexity. Besides, it operates in cloud-edge networks and offers holistic support for many applications covering edge-real-time monitoring and cloud-based updates and corrections. Experimental results on bearing fault diagnosis verify the superiority of our technique compared to a competing self-training method. The results from two case studies show that our method outperforms self-training for different noise levels and amounts of available data with accuracy gains reaching from 10.6% to 33.9%. They demonstrate that fusing time-domain and TF-based models offers opportunities for developing high-performance industrial solutions.

LGJan 28, 2025
CoRe-Net: Co-Operational Regressor Network with Progressive Transfer Learning for Blind Radar Signal Restoration

Muhammad Uzair Zahid, Serkan Kiranyaz, Alper Yildirim et al.

Real-world radar signals are frequently corrupted by various artifacts, including sensor noise, echoes, interference, and intentional jamming, differing in type, severity, and duration. This pilot study introduces a novel model, called Co-Operational Regressor Network (CoRe-Net) for blind radar signal restoration, designed to address such limitations and drawbacks. CoRe-Net replaces adversarial training with a novel cooperative learning strategy, leveraging the complementary roles of its Apprentice Regressor (AR) and Master Regressor (MR). The AR restores radar signals corrupted by various artifacts, while the MR evaluates the quality of the restoration and provides immediate and task-specific feedback, ensuring stable and efficient learning. The AR, therefore, has the advantage of both self-learning and assistive learning by the MR. The proposed model has been extensively evaluated over the benchmark Blind Radar Signal Restoration (BRSR) dataset, which simulates diverse real-world artifact scenarios. Under the fair experimental setup, this study shows that the CoRe-Net surpasses the Op-GANs over a 1 dB mean SNR improvement. To further boost the performance gain, this study proposes multi-pass restoration by cascaded CoRe-Nets trained with a novel paradigm called Progressive Transfer Learning (PTL), which enables iterative refinement, thus achieving an additional 2 dB mean SNR enhancement. Multi-pass CoRe-Net training by PTL consistently yields incremental performance improvements through successive restoration passes whilst highlighting CoRe-Net ability to handle such a complex and varying blend of artifacts.

IVDec 9, 2024
Echocardiography to Cardiac MRI View Transformation for Real-Time Blind Restoration

Ilke Adalioglu, Serkan Kiranyaz, Mete Ahishali et al.

Echocardiography is the most widely used imaging to monitor cardiac functions, serving as the first line in early detection of myocardial ischemia and infarction. However, echocardiography often suffers from several artifacts including sensor noise, lack of contrast, severe saturation, and missing myocardial segments which severely limit its usage in clinical diagnosis. In recent years, several machine learning methods have been proposed to improve echocardiography views. Yet, these methods usually address only a specific problem (e.g. denoising) and thus cannot provide a robust and reliable restoration in general. On the other hand, cardiac MRI provides a clean view of the heart without suffering such severe issues. However, due to its significantly higher cost, it is often only afforded by a few major hospitals, hence hindering its use and accessibility. In this pilot study, we propose a novel approach to transform echocardiography into the cardiac MRI view. For this purpose, Echo2MRI dataset, consisting of echocardiography and real cardiac MRI image pairs, is composed and will be shared publicly. A dedicated Cycle-consistent Generative Adversarial Network (Cycle-GAN) is trained to learn the transformation from echocardiography frames to cardiac MRI views. An extensive set of qualitative evaluations shows that the proposed transformer can synthesize high-quality artifact-free synthetic cardiac MRI views from a given sequence of echocardiography frames. Medical evaluations performed by a group of cardiologists further demonstrate that synthetic MRI views are indistinguishable from their original counterparts and are preferred over their initial sequence of echocardiography frames for diagnosis in 78.9% of the cases.

IVFeb 21, 2022
OSegNet: Operational Segmentation Network for COVID-19 Detection using Chest X-ray Images

Aysen Degerli, Serkan Kiranyaz, Muhammad E. H. Chowdhury et al.

Coronavirus disease 2019 (COVID-19) has been diagnosed automatically using Machine Learning algorithms over chest X-ray (CXR) images. However, most of the earlier studies used Deep Learning models over scarce datasets bearing the risk of overfitting. Additionally, previous studies have revealed the fact that deep networks are not reliable for classification since their decisions may originate from irrelevant areas on the CXRs. Therefore, in this study, we propose Operational Segmentation Network (OSegNet) that performs detection by segmenting COVID-19 pneumonia for a reliable diagnosis. To address the data scarcity encountered in training and especially in evaluation, this study extends the largest COVID-19 CXR dataset: QaTa-COV19 with 121,378 CXRs including 9258 COVID-19 samples with their corresponding ground-truth segmentation masks that are publicly shared with the research community. Consequently, OSegNet has achieved a detection performance with the highest accuracy of 99.65% among the state-of-the-art deep models with 98.09% precision.

SPJan 29, 2022
Blind ECG Restoration by Operational Cycle-GANs

Serkan Kiranyaz, Ozer Can Devecioglu, Turker Ince et al.

Continuous long-term monitoring of electrocardiography (ECG) signals is crucial for the early detection of cardiac abnormalities such as arrhythmia. Non-clinical ECG recordings acquired by Holter and wearable ECG sensors often suffer from severe artifacts such as baseline wander, signal cuts, motion artifacts, variations on QRS amplitude, noise, and other interferences. Usually, a set of such artifacts occur on the same ECG signal with varying severity and duration, and this makes an accurate diagnosis by machines or medical doctors extremely difficult. Despite numerous studies that have attempted ECG denoising, they naturally fail to restore the actual ECG signal corrupted with such artifacts due to their simple and naive noise model. In this study, we propose a novel approach for blind ECG restoration using cycle-consistent generative adversarial networks (Cycle-GANs) where the quality of the signal can be improved to a clinical level ECG regardless of the type and severity of the artifacts corrupting the signal. To further boost the restoration performance, we propose 1D operational Cycle-GANs with the generative neuron model. The proposed approach has been evaluated extensively using one of the largest benchmark ECG datasets from the China Physiological Signal Challenge (CPSC-2020) with more than one million beats. Besides the quantitative and qualitative evaluations, a group of cardiologists performed medical evaluations to validate the quality and usability of the restored ECG, especially for an accurate arrhythmia diagnosis.

LGJan 20, 2022
RamanNet: A generalized neural network architecture for Raman Spectrum Analysis

Nabil Ibtehaz, Muhammad E. H. Chowdhury, Amith Khandakar et al.

Raman spectroscopy provides a vibrational profile of the molecules and thus can be used to uniquely identify different kind of materials. This sort of fingerprinting molecules has thus led to widespread application of Raman spectrum in various fields like medical dignostics, forensics, mineralogy, bacteriology and virology etc. Despite the recent rise in Raman spectra data volume, there has not been any significant effort in developing generalized machine learning methods for Raman spectra analysis. We examine, experiment and evaluate existing methods and conjecture that neither current sequential models nor traditional machine learning models are satisfactorily sufficient to analyze Raman spectra. Both has their perks and pitfalls, therefore we attempt to mix the best of both worlds and propose a novel network architecture RamanNet. RamanNet is immune to invariance property in CNN and at the same time better than traditional machine learning models for the inclusion of sparse connectivity. Our experiments on 4 public datasets demonstrate superior performance over the much complex state-of-the-art methods and thus RamanNet has the potential to become the defacto standard in Raman spectra data analysis

CVNov 29, 2021
Image denoising by Super Neurons: Why go deep?

Junaid Malik, Serkan Kiranyaz, Moncef Gabbouj

Classical image denoising methods utilize the non-local self-similarity principle to effectively recover image content from noisy images. Current state-of-the-art methods use deep convolutional neural networks (CNNs) to effectively learn the mapping from noisy to clean images. Deep denoising CNNs manifest a high learning capacity and integrate non-local information owing to the large receptive field yielded by numerous cascade of hidden layers. However, deep networks are also computationally complex and require large data for training. To address these issues, this study draws the focus on the Self-organized Operational Neural Networks (Self-ONNs) empowered by a novel neuron model that can achieve a similar or better denoising performance with a compact and shallow model. Recently, the concept of super-neurons has been introduced which augment the non-linear transformations of generative neurons by utilizing non-localized kernel locations for an enhanced receptive field size. This is the key accomplishment which renders the need for a deep network configuration. As the integration of non-local information is known to benefit denoising, in this work we investigate the use of super neurons for both synthetic and real-world image denoising. We also discuss the practical issues in implementing the super neuron model on GPUs and propose a trade-off between the heterogeneity of non-localized operations and computational complexity. Our results demonstrate that with the same width and depth, Self-ONNs with super neurons provide a significant boost of denoising performance over the networks with generative and convolutional neurons for both denoising tasks. Moreover, results demonstrate that Self-ONNs with super neurons can achieve a competitive and superior synthetic denoising performances than well-known deep CNN denoisers for synthetic and real-world denoising, respectively.

NINov 27, 2021
ML-based Handover Prediction and AP Selection in Cognitive Wi-Fi Networks

Muhammad Asif Khan, Ridha Hamila, Adel Gastli et al.

Device mobility in dense Wi-Fi networks offers several challenges. Two well-known problems related to device mobility are handover prediction and access point selection. Due to the complex nature of the radio environment, analytical models may not characterize the wireless channel, which makes the solution of these problems very difficult. Recently, cognitive network architectures using sophisticated learning techniques are increasingly being applied to such problems. In this paper, we propose data-driven machine learning (ML) schemes to efficiently solve these problems in wireless LAN (WLAN) networks. The proposed schemes are evaluated and results are compared with traditional approaches to the aforementioned problems. The results report significant improvement in network performance by applying the proposed schemes. The proposed scheme for handover prediction outperforms traditional methods i.e. received signal strength method and traveling distance method by reducing the number of unnecessary handovers by 60% and 50% respectively. Similarly, in AP selection, the proposed scheme outperforms the strongest signal first and least loaded first algorithms by achieving higher throughput gains up to 9.2% and 8% respectively.

SPSep 30, 2021
Robust Peak Detection for Holter ECGs by Self-Organized Operational Neural Networks

Moncef Gabbouj, Serkan Kiranyaz, Junaid Malik et al.

Although numerous R-peak detectors have been proposed in the literature, their robustness and performance levels may significantly deteriorate in low-quality and noisy signals acquired from mobile electrocardiogram (ECG) sensors, such as Holter monitors. Recently, this issue has been addressed by deep 1-D convolutional neural networks (CNNs) that have achieved state-of-the-art performance levels in Holter monitors; however, they pose a high complexity level that requires special parallelized hardware setup for real-time processing. On the other hand, their performance deteriorates when a compact network configuration is used instead. This is an expected outcome as recent studies have demonstrated that the learning performance of CNNs is limited due to their strictly homogenous configuration with the sole linear neuron model. In this study, to further boost the peak detection performance along with an elegant computational efficiency, we propose 1-D Self-Organized ONNs (Self-ONNs) with generative neurons. The most crucial advantage of 1-D Self-ONNs over the ONNs is their self-organization capability that voids the need to search for the best operator set per neuron since each generative neuron has the ability to create the optimal operator during training. The experimental results over the China Physiological Signal Challenge-2020 (CPSC) dataset with more than one million ECG beats show that the proposed 1-D Self-ONNs can significantly surpass the state-of-the-art deep CNN with less computational complexity. Results demonstrate that the proposed solution achieves a 99.10% F1-score, 99.79% sensitivity, and 98.42% positive predictivity in the CPSC dataset, which is the best R-peak detection performance ever achieved.

SPSep 30, 2021
Real-Time Patient-Specific ECG Classification by 1D Self-Operational Neural Networks

Junaid Malik, Ozer Can Devecioglu, Serkan Kiranyaz et al.

Despite the proliferation of numerous deep learning methods proposed for generic ECG classification and arrhythmia detection, compact systems with the real-time ability and high accuracy for classifying patient-specific ECG are still few. Particularly, the scarcity of patient-specific data poses an ultimate challenge to any classifier. Recently, compact 1D Convolutional Neural Networks (CNNs) have achieved the state-of-the-art performance level for the accurate classification of ventricular and supraventricular ectopic beats. However, several studies have demonstrated the fact that the learning performance of the conventional CNNs is limited because they are homogenous networks with a basic (linear) neuron model. In order to address this deficiency and further boost the patient-specific ECG classification performance, in this study, we propose 1D Self-organized Operational Neural Networks (1D Self-ONNs). Due to its self-organization capability, Self-ONNs have the utmost advantage and superiority over conventional ONNs where the prior operator search within the operator set library to find the best possible set of operators is entirely avoided. As the first study where 1D Self-ONNs are ever proposed for a classification task, our results over the MIT-BIH arrhythmia benchmark database demonstrate that 1D Self-ONNs can surpass 1D CNNs with a significant margin while having a similar computational complexity. Under AAMI recommendations and with minimal common training data used, over the entire MIT-BIH dataset 1D Self-ONNs have achieved 98% and 99.04% average accuracies, 76.6% and 93.7% average F1 scores on supra-ventricular and ventricular ectopic beat (VEB) classifications, respectively, which is the highest performance level ever reported.

LGSep 30, 2021
Early Bearing Fault Diagnosis of Rotating Machinery by 1D Self-Organized Operational Neural Networks

Turker Ince, Junaid Malik, Ozer Can Devecioglu et al.

Preventive maintenance of modern electric rotating machinery (RM) is critical for ensuring reliable operation, preventing unpredicted breakdowns and avoiding costly repairs. Recently many studies investigated machine learning monitoring methods especially based on Deep Learning networks focusing mostly on detecting bearing faults; however, none of them addressed bearing fault severity classification for early fault diagnosis with high enough accuracy. 1D Convolutional Neural Networks (CNNs) have indeed achieved good performance for detecting RM bearing faults from raw vibration and current signals but did not classify fault severity. Furthermore, recent studies have demonstrated the limitation in terms of learning capability of conventional CNNs attributed to the basic underlying linear neuron model. Recently, Operational Neural Networks (ONNs) were proposed to enhance the learning capability of CNN by introducing non-linear neuron models and further heterogeneity in the network configuration. In this study, we propose 1D Self-organized ONNs (Self-ONNs) with generative neurons for bearing fault severity classification and providing continuous condition monitoring. Experimental results over the benchmark NSF/IMS bearing vibration dataset using both x- and y-axis vibration signals for inner race and rolling element faults demonstrate that the proposed 1D Self-ONNs achieve significant performance gap against the state-of-the-art (1D CNNs) with similar computational complexity.

IVSep 28, 2021
Real-Time Glaucoma Detection from Digital Fundus Images using Self-ONNs

Ozer Can Devecioglu, Junaid Malik, Turker Ince et al.

Glaucoma leads to permanent vision disability by damaging the optical nerve that transmits visual images to the brain. The fact that glaucoma does not show any symptoms as it progresses and cannot be stopped at the later stages, makes it critical to be diagnosed in its early stages. Although various deep learning models have been applied for detecting glaucoma from digital fundus images, due to the scarcity of labeled data, their generalization performance was limited along with high computational complexity and special hardware requirements. In this study, compact Self-Organized Operational Neural Networks (Self- ONNs) are proposed for early detection of glaucoma in fundus images and their performance is compared against the conventional (deep) Convolutional Neural Networks (CNNs) over three benchmark datasets: ACRIMA, RIM-ONE, and ESOGU. The experimental results demonstrate that Self-ONNs not only achieve superior detection performance but can also significantly reduce the computational complexity making it a potentially suitable network model for biomedical datasets especially when the data is scarce.

SPAug 4, 2021
Generalized Tensor Summation Compressive Sensing Network (GTSNET): An Easy to Learn Compressive Sensing Operation

Mehmet Yamac, Ugur Akpinar, Erdem Sahin et al.

In CS literature, the efforts can be divided into two groups: finding a measurement matrix that preserves the compressed information at the maximum level, and finding a reconstruction algorithm for the compressed information. In the traditional CS setup, the measurement matrices are selected as random matrices, and optimization-based iterative solutions are used to recover the signals. However, when we handle large signals, using random matrices become cumbersome especially when it comes to iterative optimization-based solutions. Even though recent deep learning-based solutions boost the reconstruction accuracy performance while speeding up the recovery, still jointly learning the whole measurement matrix is a difficult process. In this work, we introduce a separable multi-linear learning of the CS matrix by representing it as the summation of arbitrary number of tensors. For a special case where the CS operation is set as a single tensor multiplication, the model is reduced to the learning-based separable CS; while a dense CS matrix can be approximated and learned as the summation of multiple tensors. Both cases can be used in CS of two or multi-dimensional signals e.g., images, multi-spectral images, videos, etc. Structural CS matrices can also be easily approximated and learned in our multi-linear separable learning setup with structural tensor sum representation. Hence, our learnable generalized tensor summation CS operation encapsulates most CS setups including separable CS, non-separable CS (traditional vector-matrix multiplication), structural CS, and CS of the multi-dimensional signals. For both gray-scale and RGB images, the proposed scheme surpasses most state-of-the-art solutions, especially in lower measurement rates. Although the performance gain remains limited from tensor to the sum of tensor representation for gray-scale images, it becomes significant in the RGB case.

CVAug 3, 2021
Super Neurons

Serkan Kiranyaz, Junaid Malik, Mehmet Yamac et al.

Self-Organized Operational Neural Networks (Self-ONNs) have recently been proposed as new-generation neural network models with nonlinear learning units, i.e., the generative neurons that yield an elegant level of diversity; however, like its predecessor, conventional Convolutional Neural Networks (CNNs), they still have a common drawback: localized (fixed) kernel operations. This severely limits the receptive field and information flow between layers and thus brings the necessity for deep and complex models. It is highly desired to improve the receptive field size without increasing the kernel dimensions. This requires a significant upgrade over the generative neurons to achieve the non-localized kernel operations for each connection between consecutive layers. In this article, we present superior (generative) neuron models (or super neurons in short) that allow random or learnable kernel shifts and thus can increase the receptive field size of each connection. The kernel localization process varies among the two super-neuron models. The first model assumes randomly localized kernels within a range and the second one learns (optimizes) the kernel locations during training. An extensive set of comparative evaluations against conventional and deformable convolutional, along with the generative neurons demonstrates that super neurons can empower Self-ONNs to achieve a superior learning and generalization capability with a minimal computational complexity burden.

IVJun 27, 2021
A Machine Learning Model for Early Detection of Diabetic Foot using Thermogram Images

Amith Khandakar, Muhammad E. H. Chowdhury, Mamun Bin Ibne Reaz et al.

Diabetes foot ulceration (DFU) and amputation are a cause of significant morbidity. The prevention of DFU may be achieved by the identification of patients at risk of DFU and the institution of preventative measures through education and offloading. Several studies have reported that thermogram images may help to detect an increase in plantar temperature prior to DFU. However, the distribution of plantar temperature may be heterogeneous, making it difficult to quantify and utilize to predict outcomes. We have compared a machine learning-based scoring technique with feature selection and optimization techniques and learning classifiers to several state-of-the-art Convolutional Neural Networks (CNNs) on foot thermogram images and propose a robust solution to identify the diabetic foot. A comparatively shallow CNN model, MobilenetV2 achieved an F1 score of ~95% for a two-feet thermogram image-based classification and the AdaBoost Classifier used 10 features and achieved an F1 score of 97 %. A comparison of the inference time for the best-performing networks confirmed that the proposed algorithm can be deployed as a smartphone application to allow the user to monitor the progression of the DFU in a home setting.

IVMar 26, 2021
Fully Automated 2D and 3D Convolutional Neural Networks Pipeline for Video Segmentation and Myocardial Infarction Detection in Echocardiography

Oumaima Hamila, Sheela Ramanna, Christopher J. Henry et al.

Cardiac imaging known as echocardiography is a non-invasive tool utilized to produce data including images and videos, which cardiologists use to diagnose cardiac abnormalities in general and myocardial infarction (MI) in particular. Echocardiography machines can deliver abundant amounts of data that need to be quickly analyzed by cardiologists to help them make a diagnosis and treat cardiac conditions. However, the acquired data quality varies depending on the acquisition conditions and the patient's responsiveness to the setup instructions. These constraints are challenging to doctors especially when patients are facing MI and their lives are at stake. In this paper, we propose an innovative real-time end-to-end fully automated model based on convolutional neural networks (CNN) to detect MI depending on regional wall motion abnormalities (RWMA) of the left ventricle (LV) from videos produced by echocardiography. Our model is implemented as a pipeline consisting of a 2D CNN that performs data preprocessing by segmenting the LV chamber from the apical four-chamber (A4C) view, followed by a 3D CNN that performs a binary classification to detect if the segmented echocardiography shows signs of MI. We trained both CNNs on a dataset composed of 165 echocardiography videos each acquired from a distinct patient. The 2D CNN achieved an accuracy of 97.18% on data segmentation while the 3D CNN achieved 90.9% of accuracy, 100% of precision and 95% of recall on MI detection. Our results demonstrate that creating a fully automated system for MI detection is feasible and propitious.

IVMar 14, 2021
COVID-19 Infection Localization and Severity Grading from Chest X-ray Images

Anas M. Tahir, Muhammad E. H. Chowdhury, Amith Khandakar et al.

Coronavirus disease 2019 (COVID-19) has been the main agenda of the whole world, since it came into sight in December 2019 as it has significantly affected the world economy and healthcare system. Given the effects of COVID-19 on pulmonary tissues, chest radiographic imaging has become a necessity for screening and monitoring the disease. Numerous studies have proposed Deep Learning approaches for the automatic diagnosis of COVID-19. Although these methods achieved astonishing performance in detection, they have used limited chest X-ray (CXR) repositories for evaluation, usually with a few hundred COVID-19 CXR images only. Thus, such data scarcity prevents reliable evaluation with the potential of overfitting. In addition, most studies showed no or limited capability in infection localization and severity grading of COVID-19 pneumonia. In this study, we address this urgent need by proposing a systematic and unified approach for lung segmentation and COVID-19 localization with infection quantification from CXR images. To accomplish this, we have constructed the largest benchmark dataset with 33,920 CXR images, including 11,956 COVID-19 samples, where the annotation of ground-truth lung segmentation masks is performed on CXRs by a novel human-machine collaborative approach. An extensive set of experiments was performed using the state-of-the-art segmentation networks, U-Net, U-Net++, and Feature Pyramid Networks (FPN). The developed network, after an extensive iterative process, reached a superior performance for lung region segmentation with Intersection over Union (IoU) of 96.11% and Dice Similarity Coefficient (DSC) of 97.99%. Furthermore, COVID-19 infections of various shapes and types were reliably localized with 83.05% IoU and 88.21% DSC. Finally, the proposed approach has achieved an outstanding COVID-19 detection performance with both sensitivity and specificity values above 99%.

CRMar 10, 2021
Multimodal EEG and Keystroke Dynamics Based Biometric System Using Machine Learning Algorithms

Arafat Rahman, Muhammad E. H. Chowdhury, Amith Khandakar et al.

With the rapid advancement of technology, different biometric user authentication, and identification systems are emerging. Traditional biometric systems like face, fingerprint, and iris recognition, keystroke dynamics, etc. are prone to cyber-attacks and suffer from different disadvantages. Electroencephalography (EEG) based authentication has shown promise in overcoming these limitations. However, EEG-based authentication is less accurate due to signal variability at different psychological and physiological conditions. On the other hand, keystroke dynamics-based identification offers high accuracy but suffers from different spoofing attacks. To overcome these challenges, we propose a novel multimodal biometric system combining EEG and keystroke dynamics. Firstly, a dataset was created by acquiring both keystroke dynamics and EEG signals from 10 users with 500 trials per user at 10 different sessions. Different statistical, time, and frequency domain features were extracted and ranked from the EEG signals and key features were extracted from the keystroke dynamics. Different classifiers were trained, validated, and tested for both individual and combined modalities for two different classification strategies - personalized and generalized. Results show that very high accuracy can be achieved both in generalized and personalized cases for the combination of EEG and keystroke dynamics. The identification and authentication accuracies were found to be 99.80% and 99.68% for Extreme Gradient Boosting (XGBoost) and Random Forest classifiers, respectively which outperform the individual modalities with a significant margin (around 5 percent). We also developed a binary template matching-based algorithm, which gives 93.64% accuracy 6X faster. The proposed method is secured and reliable for any kind of biometric authentication.

CVMar 4, 2021
Convolutional versus Self-Organized Operational Neural Networks for Real-World Blind Image Denoising

Junaid Malik, Serkan Kiranyaz, Mehmet Yamac et al.

Real-world blind denoising poses a unique image restoration challenge due to the non-deterministic nature of the underlying noise distribution. Prevalent discriminative networks trained on synthetic noise models have been shown to generalize poorly to real-world noisy images. While curating real-world noisy images and improving ground truth estimation procedures remain key points of interest, a potential research direction is to explore extensions to the widely used convolutional neuron model to enable better generalization with fewer data and lower network complexity, as opposed to simply using deeper Convolutional Neural Networks (CNNs). Operational Neural Networks (ONNs) and their recent variant, Self-organized ONNs (Self-ONNs), propose to embed enhanced non-linearity into the neuron model and have been shown to outperform CNNs across a variety of regression tasks. However, all such comparisons have been made for compact networks and the efficacy of deploying operational layers as a drop-in replacement for convolutional layers in contemporary deep architectures remains to be seen. In this work, we tackle the real-world blind image denoising problem by employing, for the first time, a deep Self-ONN. Extensive quantitative and qualitative evaluations spanning multiple metrics and four high-resolution real-world noisy image datasets against the state-of-the-art deep CNN network, DnCNN, reveal that deep Self-ONNs consistently achieve superior results with performance gains of up to 1.76dB in PSNR. Furthermore, Self-ONNs with half and even quarter the number of layers that require only a fraction of computational resources as that of DnCNN can still achieve similar or better results compared to the state-of-the-art.

CVMar 4, 2021
BM3D vs 2-Layer ONN

Junaid Malik, Serkan Kiranyaz, Mehmet Yamac et al.

Despite their recent success on image denoising, the need for deep and complex architectures still hinders the practical usage of CNNs. Older but computationally more efficient methods such as BM3D remain a popular choice, especially in resource-constrained scenarios. In this study, we aim to find out whether compact neural networks can learn to produce competitive results as compared to BM3D for AWGN image denoising. To this end, we configure networks with only two hidden layers and employ different neuron models and layer widths for comparing the performance with BM3D across different AWGN noise levels. Our results conclusively show that the recently proposed self-organized variant of operational neural networks based on a generative neuron model (Self-ONNs) is not only a better choice as compared to CNNs, but also provide competitive results as compared to BM3D and even significantly surpass it for high noise levels.