CV · Jul 12, 2023 · Code
Operational Support Estimator Networks
Mete Ahishali, Mehmet Yamac, Serkan Kiranyaz et al.
In this work, we propose a novel approach called Operational Support Estimator Networks (OSENs) for the support estimation task. Support Estimation (SE) is defined as finding the locations of non-zero elements in sparse signals. By its very nature, the mapping between the measurement and sparse signal is a non-linear operation. Traditional support estimators rely on computationally expensive iterative signal recovery techniques to achieve such non-linearity. Unlike convolutional layers, the operational layers of the proposed OSEN approach can learn such complex non-linearities without the need for deep networks. In this way, the performance of non-iterative support estimation is greatly improved. Moreover, the operational layers comprise so-called generative super neurons with non-local kernels. The kernel location for each neuron/feature map is optimized jointly for the SE task during training. We evaluate the OSENs in three different applications: (i) support estimation from Compressive Sensing (CS) measurements, (ii) representation-based classification, and (iii) learning-aided CS reconstruction, where the output of OSENs is used as prior knowledge to the CS algorithm for enhanced reconstruction. Experimental results show that the proposed approach achieves computational efficiency and outperforms competing methods by significant margins, especially at low measurement rates. The software implementation is shared at https://github.com/meteahishali/OSEN.
IV · Mar 18, 2021
Advance Warning Methodologies for COVID-19 using Chest X-Ray Images
Mete Ahishali, Aysen Degerli, Mehmet Yamac et al.
Coronavirus disease 2019 (COVID-19) has rapidly become a global health concern after its first known detection in December 2019. As a result, an accurate and reliable advance warning system for the early diagnosis of COVID-19 has now become a priority. According to expert medical doctors, detecting COVID-19 in its early stages from chest X-ray images is not a straightforward task, because the traces of the infection are visible only when the disease has progressed to a moderate or severe stage. In this study, our first aim is to evaluate the ability of recent state-of-the-art Machine Learning techniques for the early detection of COVID-19 from chest X-ray images. Both compact classifiers and deep learning approaches are considered in this study. Furthermore, we propose using a recent compact classifier, the Convolutional Support Estimator Network (CSEN), for this purpose since it is well-suited for a scarce-data classification task. Finally, this study introduces a new benchmark dataset called Early-QaTa-COV19, which consists of 1065 early-stage COVID-19 pneumonia samples (very limited or no infection signs) labelled by medical doctors and 12,544 samples for the control (normal) class. A detailed set of experiments shows that the CSEN achieves the top sensitivity (over 97%) with over 95.5% specificity. Moreover, the DenseNet-121 network produces the leading performance among the deep networks with 95% sensitivity and 99.74% specificity.
96.2 · CE · May 25
From Reports to Ontologies: Ontology-Guided Representation Learning for 12-Lead ECG
Lei Xu, Fahad Sohrab, Mehmet Yamac et al.
The 12-lead electrocardiogram (ECG) is a quasi-periodic, multi-channel signal with diagnostic content spanning timescales from millisecond waveform morphology to multi-second rhythm dynamics. Existing ECG representation learning relies on signal-only self-supervision or ECG-text multimodal alignment, neither of which exploits the structured diagnostic codes attached to every clinical recording. We present MAR-ECG, an ontology-guided masked autoregressive framework that supervises the encoder with a curated 40-node SNOMED-CT cardiac graph through graph alignment, eliminating the need for paired clinical reports. MAR-ECG combines two complementary objectives. First, graph-smoothed contrastive learning (GSCL) anchors the encoder's rhythm-pooled features to the SNOMED graph, softening supervision targets by ontology distance so that clinically related concepts reinforce one another rather than function as hard negatives. Second, multi-scale physiological supervision complements GSCL with signal-derived patch auxiliaries that target rhythm-physiology statistics extracted automatically from the input, extending supervision beyond the patch tier at no annotation cost. Pretrained on ~40K publicly available 12-lead ECGs with SNOMED-CT codes and evaluated by frozen linear probing on five downstream classification benchmarks, MAR-ECG consistently outperforms a strong masked-autoregressive baseline, with mean gains in the low-label regime. Despite the absence of paired clinical text, MAR-ECG achieves performance competitive with state-of-the-art multimodal ECG-text methods.
CV · Jun 1, 2023
Deformable Convolutions and LSTM-based Flexible Event Frame Fusion Network for Motion Deblurring
Dan Yang, Mehmet Yamac
Event cameras differ from conventional RGB cameras in that they produce asynchronous data sequences. While RGB cameras capture every frame at a fixed rate, event cameras only capture changes in the scene, resulting in sparse and asynchronous data output. Although event data carries useful information that can be utilized in motion deblurring of RGB cameras, integrating event and image information remains a challenge. Recent state-of-the-art CNN-based deblurring solutions produce multiple 2-D event frames based on the accumulation of event data over a time period. In most of these techniques, however, the number of event frames is fixed and predefined, which reduces temporal resolution drastically, particularly when fast-moving objects are present or when longer exposure times are required. It is also important to note that modern cameras (e.g., cameras in mobile phones) dynamically set the exposure time of the image, which presents an additional problem for networks developed for a fixed number of event frames. To address these challenges, we developed a Long Short-Term Memory (LSTM)-based event feature extraction module, which enables us to use a dynamically varying number of event frames. Using these modules, we constructed a state-of-the-art deblurring network, Deformable Convolutions and LSTM-based Flexible Event Frame Fusion Network (DLEFNet). It is particularly useful for scenarios in which exposure times vary depending on factors such as lighting conditions or the presence of fast-moving objects in the scene. Evaluation results demonstrate that the proposed method can outperform the existing state-of-the-art networks for the deblurring task on synthetic and real-world datasets.
33.4 · LG · May 19
Axiomatizing Neural Networks via Pursuit of Subspaces
Mehmet Yamac, Mert Duman, Ugur Akpinar et al.
While deep neural networks have achieved remarkable success across a wide range of domains, their underlying mechanisms remain poorly understood, and they are often regarded as black boxes. This gap between empirical performance and theoretical understanding poses a challenge analogous to the pre-axiomatic stage of classical geometry. In this work, we introduce the Pursuit of Subspaces (PoS) hypothesis, an axiomatic framework that formulates neural network behavior through a set of geometric postulates. These axioms, together with their derived consequences, provide a unified perspective on representation, computation, and generalization in both shallow and deep architectures. We show that this framework yields geometric explanations for fundamental questions in deep learning, including representation structure, architectural mechanisms, and generalization behavior, offering a principled step toward a coherent theoretical foundation.
CV · Apr 22, 2025 · Code
Multi-Scale Tensorial Summation and Dimensional Reduction Guided Neural Network for Edge Detection
Lei Xu, Mehmet Yamac, Mete Ahishali et al.
Edge detection has attracted considerable attention thanks to its exceptional ability to enhance performance in downstream computer vision tasks. In recent years, various deep learning methods have been explored for edge detection tasks resulting in a significant performance improvement compared to conventional computer vision algorithms. In neural networks, edge detection tasks require considerably large receptive fields to provide satisfactory performance. In a typical convolutional operation, such a large receptive field can be achieved by utilizing a significant number of consecutive layers, which yields deep network structures. Recently, a Multi-scale Tensorial Summation (MTS) factorization operator was presented, which can achieve very large receptive fields even from the initial layers. In this paper, we propose a novel MTS Dimensional Reduction (MTS-DR) module guided neural network, MTS-DR-Net, for the edge detection task. The MTS-DR-Net uses MTS layers, and corresponding MTS-DR blocks as a new backbone to remove redundant information initially. Such a dimensional reduction module enables the neural network to focus specifically on relevant information (i.e., necessary subspaces). Finally, a weight U-shaped refinement module follows MTS-DR blocks in the MTS-DR-Net. We conducted extensive experiments on two benchmark edge detection datasets: BSDS500 and BIPEDv2 to verify the effectiveness of our model. The implementation of the proposed MTS-DR-Net can be found at https://github.com/LeiXuAI/MTS-DR-Net.git.
CV · Jun 27, 2021 · Code
Representation Based Regression for Object Distance Estimation
Mete Ahishali, Mehmet Yamac, Serkan Kiranyaz et al.
In this study, we propose a novel approach to predict the distances of the detected objects in an observed scene. The proposed approach modifies the recently proposed Convolutional Support Estimator Networks (CSENs). CSENs are designed to compute a direct mapping for the Support Estimation (SE) task in a representation-based classification problem. We further propose and demonstrate that representation-based methods (sparse or collaborative representation) can be used in well-designed regression problems. To the best of our knowledge, this is the first representation-based method proposed for performing a regression task by utilizing the modified CSENs; hence, we name this novel approach Representation-based Regression (RbR). The initial version of CSENs has a proxy mapping stage (i.e., a coarse estimation for the support set) that is required for the input. In this study, we improve the CSEN model by proposing Compressive Learning CSEN (CL-CSEN), which can jointly optimize the so-called proxy mapping stage along with the convolutional layers. The experimental evaluations using the KITTI 3D Object Detection distance estimation dataset show that the proposed method can achieve a significantly improved distance estimation performance over all competing methods. Finally, the software implementations of the methods are publicly shared at https://github.com/meteahishali/CSENDistance.
CV · Jul 14, 2025
Expert Operational GANS: Towards Real-Color Underwater Image Restoration
Ozer Can Devecioglu, Serkan Kiranyaz, Mehmet Yamac et al.
The wide range of deformation artifacts that arise from complex light propagation, scattering, and depth-dependent attenuation makes underwater image restoration a challenging problem. Like other single deep regressor networks, conventional GAN-based restoration methods struggle to perform well across this heterogeneous domain, since a single generator network is typically insufficient to capture the full range of visual degradations. To overcome this limitation, we propose xOp-GAN, a novel GAN model with several expert generator networks, each trained solely on a particular subset with a certain image quality. Thus, each generator can learn to maximize its restoration performance for a particular quality range. Once an xOp-GAN is trained, each generator can restore the input image, and the best restored image can then be selected by the discriminator based on its perceptual confidence score. As a result, xOp-GAN is the first GAN model with multiple generators where the discriminator is used during the inference of the regression task. Experimental results on the benchmark Large Scale Underwater Image (LSUI) dataset demonstrate that xOp-GAN achieves PSNR levels up to 25.16 dB, surpassing all single-regressor models by a large margin, even with reduced complexity.
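The inference step described above, where every expert generator restores the input and the discriminator keeps the output it scores highest, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `select_best_restoration`, the toy generators, and the toy discriminator score are all hypothetical stand-ins.

```python
from typing import Callable, Sequence, Tuple

import numpy as np

Image = np.ndarray


def select_best_restoration(
    image: Image,
    generators: Sequence[Callable[[Image], Image]],
    discriminator: Callable[[Image], float],
) -> Tuple[int, Image, float]:
    """Run every expert generator and keep the output the discriminator scores highest."""
    candidates = [g(image) for g in generators]
    scores = [float(discriminator(c)) for c in candidates]
    best = int(np.argmax(scores))
    return best, candidates[best], scores[best]


# Toy stand-ins (assumptions for illustration only, not trained networks).
toy_generators = [
    lambda x: np.clip(x * 0.5, 0.0, 1.0),
    lambda x: np.clip(x * 1.5, 0.0, 1.0),
    lambda x: np.clip(x + 0.4, 0.0, 1.0),
]


def toy_discriminator(x: Image) -> float:
    # Hypothetical "confidence": prefers outputs whose mean intensity is near 0.5.
    return 1.0 - abs(float(x.mean()) - 0.5)


degraded = np.full((4, 4), 0.3)
index, restored, score = select_best_restoration(degraded, toy_generators, toy_discriminator)
```

The selection itself is only an argmax over discriminator scores, so its cost at inference is one generator pass per expert plus one discriminator pass per candidate.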
SP · Aug 4, 2021
Generalized Tensor Summation Compressive Sensing Network (GTSNET): An Easy to Learn Compressive Sensing Operation
Mehmet Yamac, Ugur Akpinar, Erdem Sahin et al.
In the CS literature, efforts can be divided into two groups: finding a measurement matrix that preserves the compressed information at the maximum level, and finding a reconstruction algorithm for the compressed information. In the traditional CS setup, the measurement matrices are selected as random matrices, and optimization-based iterative solutions are used to recover the signals. However, when we handle large signals, using random matrices becomes cumbersome, especially when it comes to iterative optimization-based solutions. Even though recent deep learning-based solutions boost the reconstruction accuracy while speeding up the recovery, jointly learning the whole measurement matrix is still a difficult process. In this work, we introduce a separable multi-linear learning of the CS matrix by representing it as the summation of an arbitrary number of tensors. For a special case where the CS operation is set as a single tensor multiplication, the model is reduced to the learning-based separable CS, while a dense CS matrix can be approximated and learned as the summation of multiple tensors. Both cases can be used in CS of two- or multi-dimensional signals, e.g., images, multi-spectral images, videos, etc. Structural CS matrices can also be easily approximated and learned in our multi-linear separable learning setup with structural tensor sum representation. Hence, our learnable generalized tensor summation CS operation encapsulates most CS setups, including separable CS, non-separable CS (traditional vector-matrix multiplication), structural CS, and CS of multi-dimensional signals. For both gray-scale and RGB images, the proposed scheme surpasses most state-of-the-art solutions, especially at lower measurement rates. Although the performance gain from the single-tensor to the sum-of-tensors representation remains limited for gray-scale images, it becomes significant in the RGB case.
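One way to picture the tensor-sum idea for a 2-D signal is as a sum of separable measurements, Y = sum_t A_t X B_t^T, which is equivalent to a single dense matrix sum_t (B_t ⊗ A_t) acting on the column-major vectorization of X. The NumPy sketch below checks that identity on random data. The sizes, the number of terms `T`, and the random factors are assumptions for illustration; in the paper the factors are learned, and the exact parameterization may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
h, w = 8, 8      # image height and width (assumed)
m1, m2 = 4, 4    # measurements kept along each mode (assumed)
T = 3            # number of summed separable terms (assumed)

X = rng.standard_normal((h, w))
A = [rng.standard_normal((m1, h)) for _ in range(T)]  # factors acting on rows
B = [rng.standard_normal((m2, w)) for _ in range(T)]  # factors acting on columns


def separable_sum_measure(X, A, B):
    """Y = sum_t A_t @ X @ B_t.T, a sum of separable (two-mode) measurements."""
    return sum(a @ X @ b.T for a, b in zip(A, B))


Y = separable_sum_measure(X, A, B)

# Equivalent dense CS matrix acting on the column-major vec(X):
# vec(A X B^T) = (B kron A) vec(X).
Phi = sum(np.kron(b, a) for a, b in zip(A, B))
y_dense = Phi @ X.flatten(order="F")

separable_params = T * (m1 * h + m2 * w)  # entries stored by the factors
dense_params = Phi.size                   # entries of the equivalent dense matrix
```

With these sizes the factors hold 192 values while the equivalent dense matrix holds 1024, which illustrates why the separable form is easier to store and learn for large signals.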
CV · Aug 3, 2021
Super Neurons
Serkan Kiranyaz, Junaid Malik, Mehmet Yamac et al.
Self-Organized Operational Neural Networks (Self-ONNs) have recently been proposed as new-generation neural network models with nonlinear learning units, i.e., the generative neurons that yield an elegant level of diversity; however, like their predecessors, conventional Convolutional Neural Networks (CNNs), they still have a common drawback: localized (fixed) kernel operations. This severely limits the receptive field and information flow between layers and thus brings the necessity for deep and complex models. It is highly desired to improve the receptive field size without increasing the kernel dimensions. This requires a significant upgrade over the generative neurons to achieve non-localized kernel operations for each connection between consecutive layers. In this article, we present superior (generative) neuron models (or super neurons in short) that allow random or learnable kernel shifts and thus can increase the receptive field size of each connection. The kernel localization process differs between the two super-neuron models. The first model assumes randomly localized kernels within a range, and the second one learns (optimizes) the kernel locations during training. An extensive set of comparative evaluations against conventional and deformable convolutional neurons, along with generative neurons, demonstrates that super neurons can empower Self-ONNs to achieve a superior learning and generalization capability with a minimal computational complexity burden.
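A rough 1-D sketch of the kernel-shift idea is given below. It assumes that a generative neuron can be written as a sum of convolutions over powers of the input, which is how Self-ONNs are commonly described but is not stated in this abstract, and it models the kernel shift as a circular shift of the input before the operation. The function names and the use of `np.roll` are illustrative choices, not the authors' code.

```python
import numpy as np


def generative_neuron_1d(x, kernels):
    """Sum over q = 1..Q of conv(x**q, kernels[q-1]); an assumed power-series form."""
    x = np.asarray(x, dtype=np.float64)
    return sum(
        np.convolve(x ** (q + 1), k, mode="same") for q, k in enumerate(kernels)
    )


def super_neuron_1d(x, kernels, shift):
    """Apply the same operation to an input displaced by `shift` samples.

    Shifting the input is equivalent to moving where the kernel reads from.
    np.roll is a circular shift, used here only for simplicity (assumption).
    `shift` could be drawn at random or treated as a learnable parameter.
    """
    return generative_neuron_1d(np.roll(np.asarray(x, dtype=np.float64), shift), kernels)


x = np.arange(6.0)
identity = np.array([0.0, 1.0, 0.0])

unshifted = super_neuron_1d(x, [identity], shift=0)       # equals x
shifted = super_neuron_1d(x, [identity], shift=2)         # equals np.roll(x, 2)
second_order = generative_neuron_1d(x, [identity, identity])  # equals x + x**2
```

The kernel size stays at three taps in every call; only the location it reads from changes, which is how the receptive field grows without larger kernels.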
CV · Mar 4, 2021
Convolutional versus Self-Organized Operational Neural Networks for Real-World Blind Image Denoising
Junaid Malik, Serkan Kiranyaz, Mehmet Yamac et al.
Real-world blind denoising poses a unique image restoration challenge due to the non-deterministic nature of the underlying noise distribution. Prevalent discriminative networks trained on synthetic noise models have been shown to generalize poorly to real-world noisy images. While curating real-world noisy images and improving ground truth estimation procedures remain key points of interest, a potential research direction is to explore extensions to the widely used convolutional neuron model to enable better generalization with fewer data and lower network complexity, as opposed to simply using deeper Convolutional Neural Networks (CNNs). Operational Neural Networks (ONNs) and their recent variant, Self-organized ONNs (Self-ONNs), propose to embed enhanced non-linearity into the neuron model and have been shown to outperform CNNs across a variety of regression tasks. However, all such comparisons have been made for compact networks, and the efficacy of deploying operational layers as a drop-in replacement for convolutional layers in contemporary deep architectures remains to be seen. In this work, we tackle the real-world blind image denoising problem by employing, for the first time, a deep Self-ONN. Extensive quantitative and qualitative evaluations spanning multiple metrics and four high-resolution real-world noisy image datasets against the state-of-the-art deep CNN network, DnCNN, reveal that deep Self-ONNs consistently achieve superior results with performance gains of up to 1.76 dB in PSNR. Furthermore, Self-ONNs with half or even a quarter of the number of layers, which require only a fraction of the computational resources of DnCNN, can still achieve similar or better results compared to the state-of-the-art.
CV · Mar 4, 2021
BM3D vs 2-Layer ONN
Junaid Malik, Serkan Kiranyaz, Mehmet Yamac et al.
Despite their recent success in image denoising, the need for deep and complex architectures still hinders the practical usage of CNNs. Older but computationally more efficient methods such as BM3D remain a popular choice, especially in resource-constrained scenarios. In this study, we aim to find out whether compact neural networks can learn to produce results competitive with BM3D for AWGN image denoising. To this end, we configure networks with only two hidden layers and employ different neuron models and layer widths to compare the performance with BM3D across different AWGN noise levels. Our results conclusively show that the recently proposed self-organized variant of operational neural networks based on a generative neuron model (Self-ONNs) is not only a better choice than CNNs, but also provides results competitive with BM3D and even significantly surpasses it for high noise levels.
IV · Sep 26, 2020
COVID-19 Infection Map Generation and Detection from Chest X-Ray Images
Aysen Degerli, Mete Ahishali, Mehmet Yamac et al.
Computer-aided diagnosis has become a necessity for accurate and immediate coronavirus disease 2019 (COVID-19) detection to aid treatment and prevent the spread of the virus. Numerous studies have proposed using Deep Learning techniques for COVID-19 diagnosis. However, they have used very limited chest X-ray (CXR) image repositories for evaluation, with only a few hundred COVID-19 samples. Moreover, these methods can neither localize nor grade the severity of COVID-19 infection. For this purpose, recent studies proposed to explore the activation maps of deep networks. However, they remain inaccurate for localizing the actual infection, making them unreliable for clinical use. This study proposes a novel method for the joint localization, severity grading, and detection of COVID-19 from CXR images by generating the so-called infection maps. To accomplish this, we have compiled the largest dataset with 119,316 CXR images including 2951 COVID-19 samples, where the annotation of the ground-truth segmentation masks is performed on CXRs by a novel collaborative human-machine approach. Furthermore, we publicly release the first CXR dataset with the ground-truth segmentation masks of the COVID-19 infected regions. A detailed set of experiments shows that state-of-the-art segmentation networks can learn to localize COVID-19 infection with an F1-score of 83.20%, which is significantly superior to the activation maps created by the previous methods. Finally, the proposed approach achieved a COVID-19 detection performance with 94.96% sensitivity and 99.88% specificity.
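For reference, the sensitivity, specificity, and F1-score quoted above follow the standard confusion-matrix definitions. The helper below is a generic sketch, with `binary_metrics` as a hypothetical name, and the toy labels are made up for illustration rather than taken from the paper. It works equally on per-image labels (detection) or flattened per-pixel masks (localization).

```python
import numpy as np


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and F1 from binary ground truth and predictions."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))    # positives correctly detected
    tn = int(np.sum(~y_true & ~y_pred))  # negatives correctly rejected
    fp = int(np.sum(~y_true & y_pred))   # false alarms
    fn = int(np.sum(y_true & ~y_pred))   # missed positives
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan,
    }


# Toy labels (illustrative only): 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```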
IV · May 8, 2020
Convolutional Sparse Support Estimator Based Covid-19 Recognition from X-ray Images
Mehmet Yamac, Mete Ahishali, Aysen Degerli et al.
Coronavirus disease (Covid-19) has been the main agenda of the whole world since it emerged in December 2019. It has already caused thousands of casualties and infected several million people worldwide. Any technological tool that can be provided to healthcare practitioners to save time, effort, and possibly lives is of crucial importance. The main tools practitioners currently use to diagnose Covid-19 are Reverse Transcription-Polymerase Chain Reaction (RT-PCR) and Computed Tomography (CT), which require significant time, resources, and acknowledged experts. X-ray imaging is a common and easily accessible tool that has great potential for Covid-19 diagnosis. In this study, we propose a novel approach for Covid-19 recognition from chest X-ray images. Despite the importance of the problem, recent studies in this domain have produced unsatisfactory results due to the limited datasets available for training. Although Deep Learning techniques can generally provide state-of-the-art performance in many classification tasks when trained properly over large datasets, such data scarcity can be a crucial obstacle when using them for Covid-19 detection. Alternative approaches such as representation-based classification (collaborative or sparse representation) might provide satisfactory performance with limited-size datasets, but they generally fall short in performance or speed compared to Machine Learning methods. To address this deficiency, the Convolutional Support Estimator Network (CSEN) has recently been proposed as a bridge between model-based and Deep Learning approaches by providing a non-iterative real-time mapping from the query sample to the ideally sparse representation coefficients' support, which is critical information for the class decision in representation-based techniques.
SP · Mar 2, 2020
Convolutional Sparse Support Estimator Network (CSEN): From energy efficient support estimation to learning-aided Compressive Sensing
Mehmet Yamac, Mete Ahishali, Serkan Kiranyaz et al.
Support estimation (SE) of a sparse signal refers to finding the location indices of the non-zero elements in a sparse representation. Most of the traditional approaches dealing with the SE problem are iterative algorithms based on greedy methods or optimization techniques. Indeed, a vast majority of them use sparse signal recovery techniques to obtain support sets instead of directly mapping the non-zero locations from denser measurements (e.g., Compressively Sensed Measurements). This study proposes a novel approach for learning such a mapping from a training set. To accomplish this objective, the Convolutional Support Estimator Networks (CSENs), each with a compact configuration, are designed. The proposed CSEN can be a crucial tool for the following scenarios: (i) Real-time and low-cost support estimation can be applied in any mobile and low-power edge device for anomaly localization, simultaneous face recognition, etc. (ii) CSEN's output can directly be used as "prior information", which improves the performance of sparse signal recovery algorithms. The results over the benchmark datasets show that state-of-the-art performance levels can be achieved by the proposed approach with a significantly reduced computational complexity.
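To make the task concrete, the sketch below builds a sparse signal, takes compressive measurements y = A x, and forms a coarse, non-iterative support guess by keeping the largest entries of the proxy A^T y. This thresholded proxy is only a simple baseline chosen for illustration. It is not the CSEN mapping, which is learned from training data, and the dimensions and function names are assumptions.

```python
import numpy as np


def support(x, tol=0.0):
    """Indices of the non-zero entries of x (the support set)."""
    return np.flatnonzero(np.abs(x) > tol)


def proxy_support(A, y, k):
    """Coarse guess: indices of the k largest-magnitude entries of the proxy A.T @ y."""
    proxy = A.T @ y
    top_k = np.argsort(np.abs(proxy))[-k:]
    return np.sort(top_k)


rng = np.random.default_rng(0)
n, m, k = 64, 32, 3  # signal length, number of measurements, sparsity (assumed)

# k-sparse signal with non-zero magnitudes in [1, 2) and random signs.
x = np.zeros(n)
true_support = np.sort(rng.choice(n, size=k, replace=False))
x[true_support] = rng.choice([-1.0, 1.0], size=k) * (1.0 + rng.random(k))

A = rng.standard_normal((m, n)) / np.sqrt(m)  # random measurement matrix
y = A @ x                                     # compressive measurements

estimate = proxy_support(A, y, k)             # may or may not match true_support
```

With a random A and few measurements, the proxy guess can miss entries of the true support, which is the gap a learned or iterative estimator is meant to close.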
CR · Jun 20, 2019
Reversible Privacy Preservation using Multi-level Encryption and Compressive Sensing
Mehmet Yamac, Mete Ahishali, Nikolaos Passalis et al.
Security monitoring via ubiquitous cameras, and their more extended use in intelligent buildings, stands to gain from advances in signal processing and machine learning. While these innovative and ground-breaking applications can be considered a boon, at the same time they raise significant privacy concerns. In fact, recent GDPR (General Data Protection Regulation) legislation has highlighted, and become an incentive for, privacy-preserving solutions. Typical privacy-preserving video monitoring schemes address these concerns by anonymizing the sensitive data. However, these approaches suffer from some limitations, since they are usually non-reversible, do not provide multiple levels of decryption, and are computationally costly. In this paper, we provide a novel privacy-preserving method, which is reversible, supports de-identification at multiple privacy levels, and can efficiently perform data acquisition, encryption, and data hiding by combining multi-level encryption with compressive sensing. The effectiveness of the proposed approach in protecting the identity of the users has been validated using the goodness of reconstruction quality and strong anonymization of the faces.
CV · May 17, 2019
Multilinear Compressive Learning
Dat Thanh Tran, Mehmet Yamac, Aysen Degerli et al.
Compressive Learning is an emerging topic that combines signal acquisition via compressive sensing and machine learning to perform inference tasks directly on a small number of measurements. Many data modalities naturally have a multi-dimensional or tensorial format, with each dimension or tensor mode representing different features such as the spatial and temporal information in video sequences or the spatial and spectral information in hyperspectral images. However, in existing compressive learning frameworks, the compressive sensing component utilizes either random or learned linear projection on the vectorized signal to perform signal acquisition, thus discarding the multi-dimensional structure of the signals. In this paper, we propose Multilinear Compressive Learning, a framework that takes into account the tensorial nature of multi-dimensional signals in the acquisition step and builds the subsequent inference model on the structurally sensed measurements. Our theoretical complexity analysis shows that the proposed framework is more efficient than its vector-based counterpart in both memory and computation requirements. With extensive experiments, we also empirically show that our Multilinear Compressive Learning framework outperforms the vector-based framework in object classification and face recognition tasks, and scales favorably when the dimensionalities of the original signals increase, making it highly efficient for high-dimensional multi-dimensional signals.
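The memory argument can be illustrated with a small sketch that senses a 3-mode tensor through one small matrix per mode (mode-n products) and compares the parameter count with the equivalent dense matrix on the vectorized signal. The tensor sizes and random sensing matrices below are assumptions for illustration, and the code covers only the acquisition step, not the paper's full pipeline.

```python
import math

import numpy as np


def multilinear_sense(X, factors):
    """Y = X x_1 A1 x_2 A2 x_3 A3: one small sensing matrix per tensor mode."""
    A1, A2, A3 = factors
    return np.einsum("ia,jb,kc,abc->ijk", A1, A2, A3, X)


rng = np.random.default_rng(0)
in_shape = (16, 16, 3)   # e.g. a small RGB patch (assumed)
out_shape = (6, 6, 2)    # size of the sensed measurement tensor (assumed)

X = rng.standard_normal(in_shape)
factors = [rng.standard_normal((m, n)) for m, n in zip(out_shape, in_shape)]
Y = multilinear_sense(X, factors)

# Vector-based counterpart: a single dense matrix on the row-major vec(X).
Phi = np.kron(np.kron(factors[0], factors[1]), factors[2])
y_vec = Phi @ X.reshape(-1)

multilinear_params = sum(m * n for m, n in zip(out_shape, in_shape))
vector_params = math.prod(out_shape) * math.prod(in_shape)
```

Here the three mode-wise matrices store 198 values in total, whereas the dense matrix producing the same 72 measurements stores 55,296, and the gap widens as the signal dimensions grow.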
LG · Oct 15, 2018
Compressively Sensed Image Recognition
Aysen Degerli, Sinem Aslan, Mehmet Yamac et al.
Compressive Sensing (CS) theory asserts that sparse signal reconstruction is possible from a small number of linear measurements. Although CS enables low-cost linear sampling, it requires non-linear and costly reconstruction. Recent works in the literature show that compressive image classification is possible in the CS domain without reconstruction of the signal. In this work, we introduce a DCT-based method that extracts binary discriminative features directly from CS measurements. These CS measurements can be obtained by using (i) a random or a pseudo-random measurement matrix, or (ii) a measurement matrix whose elements are learned from the training data to optimize the given classification task. We further introduce feature fusion by concatenating the Bag of Words (BoW) representation of our binary features with one of the two state-of-the-art CNN-based feature vectors. We show that our fused feature outperforms the state-of-the-art in both cases.