IV · Dec 19, 2022 · Code
Focal-UNet: UNet-like Focal Modulation for Medical Image Segmentation
MohammadReza Naderi, MohammadHossein Givkashi, Fatemeh Piri et al.
Recently, many attempts have been made to construct transformer-based U-shaped architectures, and new methods have been proposed that outperform CNN-based rivals. However, serious problems such as blockiness and cropped edges in predicted masks remain because of transformers' patch partitioning operations. In this work, we propose a new U-shaped architecture for medical image segmentation with the help of the newly introduced focal modulation mechanism. The proposed architecture has asymmetric depths for the encoder and decoder. Due to the ability of the focal module to aggregate local and global features, our model can simultaneously benefit from the wide receptive field of transformers and the local view of CNNs. This helps the proposed method balance local and global feature usage and outperform Swin-UNet, one of the most powerful transformer-based U-shaped models. We achieved a 1.68% higher DICE score and a 0.89 better HD metric on the Synapse dataset. Also, with extremely limited data, we achieved a 4.25% higher DICE score on the NeoPolyp dataset. Our implementations are available at: https://github.com/givkashi/Focal-UNet
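For readers unfamiliar with the DICE score quoted above, the following is a minimal sketch of how the coefficient is commonly computed for two binary masks. The function name `dice_score`, the flat-list mask representation, and the empty-mask convention are assumptions made for illustration; this is not the evaluation code from the linked repository.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 0, 1]
print(dice_score(pred, target))  # 2 shared pixels, 3 + 3 foreground pixels
```

The score ranges from 0 (no overlap) to 1 (identical masks), so a reported gain of a few percent corresponds to a small but measurable increase in overlap with the ground truth.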
IV · Mar 12, 2023
Endoscopy Classification Model Using Swin Transformer and Saliency Map
Zahra Sobhaninia, Nasrin Abharian, Nader Karimi et al.
Endoscopy is a valuable tool for the early diagnosis of colon cancer. However, it requires the expertise of endoscopists and is a time-consuming process. In this work, we propose a new multi-label classification method, which considers two aspects of learning approaches (local and global views) for endoscopic image classification. The model consists of a Swin transformer branch and a modified VGG16 model as a CNN branch. To help the learning process of the CNN branch, the model employs saliency maps and endoscopy images and concatenates them. The results demonstrate that this method performed well for endoscopic medical images by utilizing local and global features of the images. Furthermore, quantitative evaluations prove the proposed method's superiority over state-of-the-art works.
CV · Jun 12, 2023
Supervised Deep Learning for Content-Aware Image Retargeting with Fourier Convolutions
MohammadHossein Givkashi, MohammadReza Naderi, Nader Karimi et al.
Image retargeting aims to alter the size of an image while attending to its content. One of the main obstacles to training deep learning models for image retargeting is the need for a vast labeled dataset, and no such labeled datasets are available for this task. We therefore present a new supervised approach for training deep learning models. We use the original images as ground truth and create inputs for the model by resizing and cropping the original images. A second challenge is generating different image sizes at inference time, since regular convolutional neural networks cannot generate images whose size differs from that of the input image. To address this issue, we introduce a mask-based method: a mask is generated to show the desired size and location of the object, and then the mask and the input image are fed to the network. Comparing image retargeting methods and our proposed method demonstrates the model's ability to produce high-quality retargeted images. Afterward, we compute the image quality assessment score for each output image based on different techniques and illustrate the effectiveness of our approach.
CV · Jan 9, 2023
SFI-Swin: Symmetric Face Inpainting with Swin Transformer by Distinctly Learning Face Components Distributions
MohammadReza Naderi, MohammadHossein Givkashi, Nader Karimi et al.
Image inpainting consists of filling holes or missing parts of an image. Inpainting face images with symmetric characteristics is more challenging than inpainting a natural scene. None of the powerful existing models can fill out the missing parts of an image while considering the symmetry and homogeneity of the picture. Moreover, the metrics that assess a repaired face image quality cannot measure the preservation of symmetry between the rebuilt and existing parts of a face. In this paper, we intend to solve the symmetry problem in the face inpainting task by using multiple discriminators that check each face organ's reality separately and a transformer-based network. We also propose "symmetry concentration score" as a new metric for measuring the symmetry of a repaired face image. The quantitative and qualitative results show the superiority of our proposed method compared to some of the recently proposed algorithms in terms of the reality, symmetry, and homogeneity of the inpainted parts.
CV · Dec 25, 2022
Adaptive Blind Watermarking Using Psychovisual Image Features
Arezoo PariZanganeh, Ghazaleh Ghorbanzadeh, Zahra Nabizadeh ShahreBabak et al.
With the growth of editing and sharing images through the internet, the importance of protecting the images' authorship has increased. Robust watermarking is a known approach to maintaining copyright protection. Robustness and imperceptibility are the two factors that watermarking methods try to maximize. Usually, there is a trade-off between them: increasing the robustness lessens the imperceptibility of the watermark. This paper proposes an adaptive method that determines the strength of the watermark embedding in different parts of the cover image according to its texture and brightness. Adaptive embedding increases the robustness while preserving the quality of the watermarked image. Experimental results also show that the proposed method can effectively reconstruct the embedded payload under different kinds of common watermarking attacks. Our proposed method has shown good performance compared to a recent technique.
CV · Sep 11, 2022
OAIR: Object-Aware Image Retargeting Using PSO and Aesthetic Quality Assessment
Mohammad Reza Naderi, Mohammad Hossein Givkashi, Nader Karimi et al.
Image retargeting aims at altering an image size while preserving important content and minimizing noticeable distortions. However, previous image retargeting methods create outputs that suffer from artifacts and distortions. Besides, most previous works attempt to retarget the background and foreground of the input image simultaneously. Simultaneous resizing of the foreground and background causes changes in the aspect ratios of the objects. The change in the aspect ratio is especially undesirable for human objects. We propose a retargeting method that overcomes these problems. The proposed approach consists of the following steps. Firstly, an inpainting method uses the input image and the binary mask of foreground objects to produce a background image without any foreground objects. Secondly, the seam carving method resizes the background image to the target size. Thirdly, a super-resolution method increases the input image quality, and the foreground objects are then extracted. Finally, the retargeted background and the extracted super-resolved objects are fed into a particle swarm optimization (PSO) algorithm. The PSO algorithm uses aesthetic quality assessment as its objective function to identify the best location and size for the objects to be placed in the background. We used image quality assessment and aesthetic quality assessment measures to show our superior results compared to popular image retargeting techniques.
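The final step, where PSO searches for an object's location and size, can be illustrated with a minimal generic particle swarm sketch. Everything here is an assumption made for illustration: the function `pso_maximize`, its hyperparameters, and in particular `toy_aesthetic_score`, which is a stand-in quadratic and not the aesthetic quality assessment model used in the paper.

```python
import random


def pso_maximize(objective, bounds, n_particles=20, n_iters=100, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Maximize `objective` over box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's best position
    best_val = [objective(p) for p in pos]      # and the score it achieved there
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # best position seen by the swarm
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_aesthetic_score(params):
    """Hypothetical objective: peaks at x=0.33, y=0.6, scale=1.0."""
    x, y, scale = params
    return -((x - 0.33) ** 2 + (y - 0.6) ** 2 + (scale - 1.0) ** 2)


# Search over normalized (x, y) placement and a relative object scale.
best, score = pso_maximize(toy_aesthetic_score, [(0.0, 1.0), (0.0, 1.0), (0.5, 1.5)])
print(best, score)
```

In a real pipeline the objective would render the object onto the retargeted background at the candidate position and size, then return the score of an aesthetic assessment model.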
CV · Nov 15, 2022
Dynamic-Pix2Pix: Noise Injected cGAN for Modeling Input and Target Domain Joint Distributions with Limited Training Data
Mohammadreza Naderi, Nader Karimi, Ali Emami et al.
Learning to translate images from a source to a target domain, with applications such as converting a simple line drawing to an oil painting, has attracted significant attention. The quality of translated images is directly related to two crucial issues. First, the consistency of the output distribution with that of the target is essential. Second, the generated output should have a high correlation with the input. Conditional Generative Adversarial Networks (cGANs) are the most common models for translating images. The performance of a cGAN drops when we use a limited training dataset. In this work, we increase the target distribution modeling ability of Pix2Pix (a form of cGAN) with the help of dynamic neural network theory. Our model has two learning cycles. The model learns the correlation between input and ground truth in the first cycle. Then, the model's architecture is refined in the second cycle to learn the target distribution from noise input. These processes are executed in each iteration of the training procedure. Helping the cGAN learn the target distribution from noise input results in better model generalization at test time and allows the model to fit almost perfectly to the target domain distribution. As a result, our model surpasses the Pix2Pix model in segmenting HC18 and Montgomery's chest x-ray images. Both qualitative results and Dice scores show the superiority of our model. Although our proposed method does not use thousands of additional data samples for pretraining, it produces results comparable to the state-of-the-art methods for in-domain and out-of-domain generalization.
IV · Mar 17, 2024
A lightweight deep learning pipeline with DRDA-Net and MobileNet for breast cancer classification
Mahdie Ahmadi, Nader Karimi, Shadrokh Samavi
Accurate and early detection of breast cancer is essential for successful treatment. This paper introduces a novel deep-learning approach for improved breast cancer classification in histopathological images, a crucial step in diagnosis. Our method hinges on the Dense Residual Dual-Shuffle Attention Network (DRDA-Net), inspired by ShuffleNet's efficient architecture. DRDA-Net achieves exceptional accuracy across various magnification levels on the BreaKHis dataset, a breast cancer histopathology analysis benchmark. However, for real-world deployment, computational efficiency is paramount. To address computational cost, we integrate a pre-trained MobileNet model, renowned for its lightweight design. MobileNet ensures fast execution even on devices with limited resources without sacrificing performance. This combined approach offers a promising solution for accurate breast cancer diagnosis, paving the way for faster and more accessible screening procedures.
MM · Dec 8, 2023
High-Quality Live Video Streaming via Transcoding Time Prediction and Preset Selection
Zahra Nabizadeh Shahre-Babak, Nader Karimi, Krishna Rapaka et al.
Video streaming often requires transcoding content into different resolutions and bitrates to match the recipient's internet speed and screen capabilities. Video encoders like x264 offer various presets, each with different tradeoffs between transcoding time and rate-distortion performance. Choosing the best preset for video transcoding is difficult, especially for live streaming, as trying all the presets and choosing the best one is not feasible. One solution is to predict each preset's transcoding time and select the preset that ensures the highest quality while adhering to live streaming time constraints. Prediction of video transcoding time is also critical in minimizing streaming delays, deploying resource management algorithms, and load balancing. We propose a learning-based framework for predicting the transcoding time of videos across various presets. Our predictor's features for video transcoding time prediction are derived directly from the ingested stream, primarily from the header or metadata. As a result, only minimal additional delay is incurred for feature extraction, rendering our approach ideal for live-streaming applications. We evaluated our learning-based transcoding time prediction using a dataset of videos. The results demonstrate that our framework can accurately predict the transcoding time for different presets, with a mean absolute percentage error (MAPE) of nearly 5.0%. Leveraging these predictions, we then select the most suitable transcoding preset for live video streaming. Utilizing our transcoding time prediction-based preset selection improved Peak Signal-to-Noise Ratio (PSNR) by up to 5 dB.
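The two ideas in this abstract, scoring a predictor with MAPE and picking a preset under a time budget, can be sketched as follows. The preset names are the standard x264 ones; the helper names, the example timings, and the assumption that a slower preset always gives better rate-distortion performance are illustrative and not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# x264 presets ordered from fastest to slowest; slower presets are assumed
# here to give better quality at a given bitrate.
PRESETS_FAST_TO_SLOW = ["ultrafast", "superfast", "veryfast", "faster",
                        "fast", "medium", "slow", "slower", "veryslow"]


def select_preset(predicted_seconds, budget_seconds):
    """Return the slowest preset whose predicted time fits the budget, or None."""
    chosen = None
    for preset in PRESETS_FAST_TO_SLOW:
        t = predicted_seconds.get(preset)
        if t is not None and t <= budget_seconds:
            chosen = preset
    return chosen


# Hypothetical predicted transcoding times (seconds) for a 2-second segment.
predicted = {"veryfast": 1.2, "medium": 1.9, "slow": 2.6}
print(select_preset(predicted, budget_seconds=2.0))  # "medium"
print(mape([2.0, 4.0], [2.1, 3.8]))                  # about 5.0
```

For live streaming the budget would typically be the segment duration, so that each segment finishes transcoding before the next one arrives.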
IV · Nov 22, 2024
BrightVAE: Luminosity Enhancement in Underexposed Endoscopic Images
Farzaneh Koohestani, Zahra Nabizadeh, Nader Karimi et al.
The enhancement of image luminosity is especially critical in endoscopic images. Underexposed endoscopic images often suffer from reduced contrast and uneven brightness, significantly impacting diagnostic accuracy and treatment planning. Internal body imaging is challenging due to uneven lighting and shadowy regions. Enhancing such images is essential since precise image interpretation is crucial for patient outcomes. In this paper, we introduce BrightVAE, an architecture based on the hierarchical Vector Quantized Variational Autoencoder (hierarchical VQ-VAE) tailored explicitly for enhancing luminosity in low-light endoscopic images. Our architecture is designed to tackle the unique challenges inherent in endoscopic imaging, such as significant variations in illumination and obscured details due to poor lighting conditions. The proposed model emphasizes advanced feature extraction from three distinct viewpoints (various receptive fields, skip connections, and feature attentions) to robustly enhance image quality and support more accurate medical diagnoses. Through rigorous experimental analysis, we demonstrate the effectiveness of these techniques in enhancing low-light endoscopic images. We evaluate our architecture with three widely recognized metrics (SSIM, PSNR, and LPIPS) on the Endo4IE dataset, which consists exclusively of endoscopic images, and show significant advancements over the state-of-the-art methods for enhancing luminosity in endoscopic imaging.
MM · Apr 13, 2024
A Parametric Rate-Distortion Model for Video Transcoding
Maedeh Jamali, Nader Karimi, Shadrokh Samavi et al.
Over the past two decades, the surge in video streaming applications has been fueled by the increasing accessibility of the internet and the growing demand for network video. As users with varying internet speeds and devices seek high-quality video, transcoding becomes essential for service providers. In this paper, we introduce a parametric rate-distortion (R-D) transcoding model. Our model excels at predicting transcoding distortion at various rates without the need for encoding the video. This model serves as a versatile tool that can be used to achieve visual quality improvement (in terms of PSNR) via trans-sizing. Moreover, we use our model to identify visually lossless and near-zero-slope bitrate ranges for an ingest video. Having this information allows us to adjust the transcoding target bitrate while introducing visually negligible quality degradations. By utilizing our model in this manner, quality improvements up to 2 dB and bitrate savings of up to 46% of the original target bitrate are possible. Experimental results demonstrate the efficacy of our model in video transcoding rate distortion prediction.
CV · Dec 23, 2023
Revealing Shadows: Low-Light Image Enhancement Using Self-Calibrated Illumination
Farzaneh Koohestani, Nader Karimi, Shadrokh Samavi
In digital imaging, enhancing visual content in poorly lit environments is a significant challenge, as images often suffer from inadequate brightness, hidden details, and an overall reduction in quality. This issue is especially critical in applications like nighttime surveillance, astrophotography, and low-light videography, where clear and detailed visual information is crucial. Our research addresses this problem by enhancing the illumination aspect of dark images. We have advanced past techniques by using varied color spaces to extract the illumination component, enhance it, and then recombine it with the other components of the image. By employing the Self-Calibrated Illumination (SCI) method, a strategy initially developed for RGB images, we effectively intensify and clarify details that are typically lost in low-light conditions. This method of selective illumination enhancement leaves the color information intact, thus preserving the color integrity of the image. Crucially, our method eliminates the need for paired images, making it suitable for situations where they are unavailable. Implementing the modified SCI technique represents a substantial shift from traditional methods, providing a refined and potent solution for low-light image enhancement. Our approach sets the stage for more complex image processing techniques and extends the range of possible real-world applications where accurate color representation and improved visibility are essential.
CV · Feb 21, 2022
DGAFF: Deep Genetic Algorithm Fitness Formation for EEG Bio-Signal Channel Selection
Ghazaleh Ghorbanzadeh, Zahra Nabizadeh, Nader Karimi et al.
Brain-computer interface systems aim to greatly facilitate human-computer interaction by directly translating brain signals for computers. Recently, using many electrodes has led to better performance in these systems. However, increasing the number of recorded electrodes leads to additional time, hardware, and computational costs, besides undesired complications of the recording process. Channel selection has been utilized to decrease data dimension and eliminate irrelevant channels while reducing the noise effects. Furthermore, the technique lowers the time and computational costs in real-time applications. We present a channel selection method called Deep GA Fitness Formation (DGAFF), which combines a sequential search method with a genetic algorithm. The proposed method accelerates the convergence of the genetic algorithm and increases the system's performance. The system evaluation is based on a lightweight deep neural network that automates the whole model training process. The proposed method outperforms other channel selection methods in classifying motor imagery on the utilized dataset.
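As a rough illustration of the sequential search component only, the sketch below shows greedy sequential forward selection over a set of channels. It does not include the genetic algorithm or the deep network of DGAFF; the channel names, the `USEFULNESS` table, and `toy_score` are hypothetical placeholders for what would in practice be the validation accuracy of a classifier trained on the candidate subset.

```python
def sequential_forward_selection(channels, score, k):
    """Greedily grow a subset of `k` channels, adding the best one each round."""
    selected = []
    remaining = list(channels)
    while remaining and len(selected) < k:
        # Try each remaining channel and keep the one that scores highest
        # when added to the current subset.
        best = max(remaining, key=lambda ch: score(selected + [ch]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical per-channel usefulness values, for illustration only.
USEFULNESS = {"C3": 0.9, "C4": 0.85, "Cz": 0.6, "Fp1": 0.1}


def toy_score(subset):
    """Stand-in fitness: sum of the made-up usefulness values."""
    return sum(USEFULNESS[ch] for ch in subset)


print(sequential_forward_selection(list(USEFULNESS), toy_score, k=2))
```

A subset found this way could, for example, seed the initial population of a genetic algorithm, which is one plausible way a sequential search might speed up convergence.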
IV · Dec 28, 2021
Brain Tumor Classification by Cascaded Multiscale Multitask Learning Framework Based on Feature Aggregation
Zahra Sobhaninia, Nader Karimi, Pejman Khadivi et al.
Brain tumor analysis in MRI images is a significant and challenging issue because misdiagnosis can lead to death. Diagnosis and evaluation of brain tumors in the early stages increase the probability of successful treatment. However, the variety of tumor types, shapes, and locations makes their segmentation and classification complex. In this regard, numerous researchers have proposed brain tumor segmentation and classification methods. This paper presents an approach that simultaneously segments and classifies brain tumors in MRI images using a framework that contains MRI image enhancement and tumor region detection. Eventually, a network based on a multitask learning approach is proposed. Subjective and objective results indicate that the segmentation and classification results based on evaluation metrics are better than or comparable to the state of the art.
CV · Dec 17, 2021
Image Inpainting Using AutoEncoder and Guided Selection of Predicted Pixels
Mohammad H. Givkashi, Mahshid Hadipour, Arezoo PariZanganeh et al.
Image inpainting is an effective method to enhance distorted digital images. Different inpainting methods use the information of neighboring pixels to predict the value of missing pixels. Recently, deep neural networks have been used to learn structural and semantic details of images for inpainting purposes. In this paper, we propose a network for image inpainting. This network, similar to U-Net, extracts various features from images, leading to better results. We improved the final results by replacing the damaged pixels with the recovered pixels of the output images. Our experimental results show that this method produces high-quality results compared to the traditional methods.
IV · Dec 7, 2021
Nuclei Segmentation in Histopathology Images using Deep Learning with Local and Global Views
Mahdi Arab Loodaricheh, Nader Karimi, Shadrokh Samavi
Digital pathology is one of the most significant developments in modern medicine. Pathological examinations are the gold standard of medical protocols and play a fundamental role in diagnosis. Recently, with the advent of digital scanners, tissue histopathology slides can be digitized and stored as digital images. As a result, digitized histopathological tissues can be used in computer-aided image analysis programs and machine learning techniques. Detection and segmentation of nuclei are some of the essential steps in the diagnosis of cancers. Recently, deep learning has been used for nuclei segmentation. However, one of the problems in deep learning methods for nuclei segmentation is the lack of information from outside the patches. This paper proposes a deep learning-based approach for nuclei segmentation, which addresses the problem of misprediction in patch border areas. We use both local and global patches to predict the final segmentation map. Experimental results on the Multi-organ histopathology dataset demonstrate that our method outperforms the baseline nuclei segmentation and popular segmentation models.
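One simple way to pair a local patch with a wider view of its surroundings is sketched below. The abstract does not describe how the global patches are built, so the fixed `context` margin, the zero padding outside the image, and the function names are all assumptions made for illustration.

```python
def crop_with_padding(image, top, left, size, fill=0):
    """Crop a size x size window; positions outside the image get `fill`."""
    h, w = len(image), len(image[0])
    return [[image[r][c] if 0 <= r < h and 0 <= c < w else fill
             for c in range(left, left + size)]
            for r in range(top, top + size)]


def local_and_global_patches(image, top, left, local_size, context):
    """Return a local patch and a larger patch that adds `context` pixels per side."""
    local = crop_with_padding(image, top, left, local_size)
    wide = crop_with_padding(image, top - context, left - context,
                             local_size + 2 * context)
    return local, wide


# A 4x4 toy image whose pixel value encodes its position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
local, wide = local_and_global_patches(image, top=1, left=1, local_size=2, context=1)
print(local)  # the central 2x2 block
print(wide)   # the same block plus a one-pixel border of context
```

The wider patch gives a network access to pixels just beyond the local patch border, which is where the abstract says mispredictions tend to occur.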
CV · Sep 12, 2021
MSGDD-cGAN: Multi-Scale Gradients Dual Discriminator Conditional Generative Adversarial Network
Mohammadreza Naderi, Zahra Nabizadeh, Nader Karimi et al.
Conditional Generative Adversarial Networks (cGANs) have been used in many image processing tasks. However, they still have serious problems maintaining the balance between conditioning the output on the input and creating the output with the desired distribution based on the corresponding ground truth. The traditional cGANs, similar to most conventional GANs, suffer from vanishing gradients, which backpropagate from the discriminator to the generator. Moreover, the traditional cGANs are sensitive to architectural changes due to the previously mentioned gradient problems. Therefore, balancing the architecture of the cGANs is almost impossible. Recently, MSG-GAN has been proposed to stabilize the performance of GANs by applying multiple connections between the generator and discriminator. In this work, we propose a method called MSGDD-cGAN, which first stabilizes the performance of the cGANs using multi-connection gradient flow. Secondly, the proposed network architecture balances the correlation of the output to the input and the fitness of the output on the target distribution. This balance is achieved by the proposed dual discrimination procedure. We tested our model by segmentation of fetal ultrasound images. Our model shows a 3.18% increase in the F1 score compared to the pix2pix version of cGANs.
IV · Aug 19, 2021
Segmentation of Lungs COVID Infected Regions by Attention Mechanism and Synthetic Data
Parham Yazdekhasty, Ali Zindari, Zahra Nabizadeh-ShahreBabak et al.
Coronavirus has caused hundreds of thousands of deaths. Fatalities could decrease if every patient could get suitable treatment by the healthcare system. Machine learning, especially computer vision methods based on deep learning, can help healthcare professionals diagnose and treat COVID-19 infected cases more efficiently. Hence, infected patients can get better service from the healthcare system and decrease the number of deaths caused by the coronavirus. This research proposes a method for segmenting infected lung regions in a CT image. For this purpose, a convolutional neural network with an attention mechanism is used to detect infected areas with complex patterns. Attention blocks improve the segmentation accuracy by focusing on informative parts of the image. Furthermore, a generative adversarial network generates synthetic images for data augmentation and expansion of small available datasets. Experimental results show the superiority of the proposed method compared to some existing procedures.
IV · Jan 21, 2021
Weighted Fuzzy-Based PSNR for Watermarking
Maedeh Jamali, Nader Karimi, Shadrokh Samavi
One of the problems of conventional visual quality evaluation criteria such as PSNR and MSE is the lack of appropriate standards based on the human visual system (HVS). They are calculated based on the difference of the corresponding pixels in the original and manipulated image. Hence, they practically do not provide a correct understanding of the image quality. Watermarking is an image processing application in which the image's visual quality is an essential criterion for its evaluation. Watermarking requires a criterion based on the HVS that provides more accurate values than conventional measures such as PSNR. This paper proposes a weighted fuzzy-based criterion that tries to find essential parts of an image based on the HVS. Then these parts will have larger weights in computing the final value of PSNR. We compare our results against standard PSNR, and our experiments show considerable consequences.
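The general idea of weighting some pixels more heavily when computing PSNR can be sketched as below. The abstract does not specify how the fuzzy, HVS-based weights are derived, so the `weights` argument is simply supplied by the caller here; the function name and the example values are illustrative assumptions.

```python
import math


def weighted_psnr(original, distorted, weights, peak=255.0):
    """PSNR computed from a weighted mean squared error over flat pixel lists."""
    weighted_error = sum(w * (o - d) ** 2
                         for o, d, w in zip(original, distorted, weights))
    wmse = weighted_error / sum(weights)
    if wmse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / wmse)


original = [100, 100, 100, 100]
distorted = [100, 100, 100, 110]  # only the last pixel was changed

# Uniform weights reduce to the standard PSNR.
print(weighted_psnr(original, distorted, [1, 1, 1, 1]))
# Marking the changed pixel as perceptually important lowers the score.
print(weighted_psnr(original, distorted, [1, 1, 1, 3]))
```

With uniform weights the measure is the conventional PSNR, which makes it easy to compare the two on the same image pair.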
IV · Nov 1, 2020
Bifurcated Autoencoder for Segmentation of COVID-19 Infected Regions in CT Images
Parham Yazdekhasty, Ali Zindar, Zahra Nabizadeh-ShahreBabak et al.
The new coronavirus infection has shocked the world since early 2020 with its aggressive outbreak. Rapid detection of the disease saves lives, and relying on medical imaging (Computed Tomography and X-ray) to detect infected lungs has been shown to be effective. Deep learning and convolutional neural networks have been used for image analysis in this context. However, accurate identification of infected regions has proven challenging for two main reasons. Firstly, the characteristics of infected areas differ in different images. Secondly, insufficient training data makes it challenging to train various machine learning algorithms, including deep-learning models. This paper proposes an approach to segment lung regions infected by COVID-19 to help cardiologists diagnose the disease more accurately, quickly, and easily. We propose a bifurcated 2-D model for two types of segmentation. This model uses a shared encoder and a bifurcated connection to two separate decoders. One decoder is for segmentation of the healthy region of the lungs, while the other is for the segmentation of the infected regions. Experiments on publicly available images show that the bifurcated structure segments infected regions of the lungs better than the state of the art.
IV · Nov 1, 2020
Brain Tumor Classification Using Medial Residual Encoder Layers
Zahra SobhaniNia, Nader Karimi, Pejman Khadivi et al.
According to the World Health Organization (WHO), cancer is the second leading cause of death worldwide, responsible for over 9.5 million deaths in 2018 alone. Brain tumors account for one out of every four cancer deaths. Therefore, accurate and timely diagnosis of brain tumors will lead to more effective treatments. Physicians classify brain tumors only with biopsy operation by brain surgery, and after diagnosing the type of tumor, a treatment plan is considered for the patient. Automatic systems based on machine learning algorithms can allow physicians to diagnose brain tumors with noninvasive measures. To date, several image classification approaches have been proposed to aid diagnosis and treatment. For brain tumor classification in this work, we offer a system based on deep learning, containing encoder blocks. These blocks are fed with post-max-pooling features as residual learning. Our approach shows promising results by improving the tumor classification accuracy in magnetic resonance imaging (MRI) images using a limited medical image dataset. Experimental evaluations of this model on a dataset consisting of 3064 MR images show 95.98% accuracy, which is better than previous studies on this database.
IV · Sep 1, 2020
Classification of Diabetic Retinopathy Using Unlabeled Data and Knowledge Distillation
Sajjad Abbasi, Mohsen Hajabdollahi, Pejman Khadivi et al.
Knowledge distillation allows transferring knowledge from a pre-trained model to another. However, it suffers from limitations, including the constraint that the two models need to be architecturally similar. Knowledge distillation addresses some of the shortcomings associated with transfer learning by generalizing a complex model to a lighter model. However, some parts of the knowledge may not be distilled sufficiently by knowledge distillation. In this paper, a novel knowledge distillation approach using transfer learning is proposed. The proposed method transfers the entire knowledge of a model to a new smaller one. To accomplish this, unlabeled data are used in an unsupervised manner to transfer the maximum amount of knowledge to the new slimmer model. The proposed method can be beneficial in medical image analysis, where labeled data are typically scarce. The proposed approach is evaluated in the context of classification of images for diagnosing Diabetic Retinopathy on two publicly available datasets, Messidor and EyePACS. Simulation results demonstrate that the approach is effective in transferring knowledge from a complex model to a lighter one. Furthermore, experimental results illustrate that the performance of different small models is improved significantly using unlabeled data and knowledge distillation.
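Because a distillation loss compares the student's outputs with the teacher's rather than with ground-truth labels, it can be evaluated on unlabeled images. The sketch below shows the widely used temperature-softened KL-divergence form of that loss; whether the paper uses exactly this form is an assumption, and the function names, the temperature value, and the example logits are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student's softened prediction
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# No label is needed: only the two models' outputs for the same image.
teacher = [4.0, 1.0, -2.0]
student = [2.5, 1.5, -1.0]
print(distillation_loss(teacher, student))
print(distillation_loss(teacher, teacher))  # zero when the outputs match
```

A higher temperature flattens both distributions, so the student also learns from the relative scores the teacher assigns to the non-predicted classes.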
SP · Jul 24, 2020
Selection of Proper EEG Channels for Subject Intention Classification Using Deep Learning
Ghazale Ghorbanzade, Zahra Nabizadeh-ShahreBabak, Shadrokh Samavi et al.
Brain signals could be used to control devices to assist individuals with disabilities. Signals such as electroencephalograms are complicated and hard to interpret. A set of signals are collected and should be classified to identify the intention of the subject. Different approaches have tried to reduce the number of channels before sending them to a classifier. We are proposing a deep learning-based method for selecting an informative subset of channels that produce high classification accuracy. The proposed network could be trained for an individual subject for the selection of an appropriate set of channels. Reduction of the number of channels could reduce the complexity of brain-computer-interface devices. Our method could find a subset of channels. The accuracy of our approach is comparable with a model trained on all channels. Hence, our model's temporal and power costs are low, while its accuracy is kept high.
MM · May 11, 2020
Hardware Implementation of Adaptive Watermarking Based on Local Spatial Disorder Analysis
Mohsen Hajabdolahi, Nader Karimi, Shahram Shirani et al.
With the increasing use of the internet and the ease of exchange of multimedia content, the protection of ownership rights has become a significant concern. Watermarking is an efficient means for this purpose. In many applications, real-time watermarking is required, which demands a hardware implementation of a low-complexity, robust algorithm. In this paper, an adaptive watermarking method is presented, which uses embedding in different bit-planes to achieve transparency and robustness. Local disorder of pixels is analyzed to control the strength of the watermark. A new low-complexity method for disorder analysis is proposed, and its hardware implementation is presented. An embedding method is proposed, which causes lower degradation in the watermarked image. Also, the performance of the proposed watermarking architecture is improved by a pipeline structure and is tested on an FPGA device. Results show that the algorithm produces transparent and robust watermarked images. The synthesis report from the FPGA implementation illustrates a low-complexity hardware structure.
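Embedding in different bit-planes according to local disorder can be illustrated in software as follows. This is only a sketch under stated assumptions: the disorder measure (mean absolute deviation), the threshold, and the choice of planes 1 and 3 are hypothetical and are not the low-complexity measure or the hardware design described in the paper.

```python
def embed_bit(pixel, bit, plane):
    """Set bit `plane` (0 = least significant) of an 8-bit pixel to `bit`."""
    mask = 1 << plane
    return (pixel & ~mask & 0xFF) | (bit << plane)


def extract_bit(pixel, plane):
    """Read back the bit stored in the given bit-plane."""
    return (pixel >> plane) & 1


def local_disorder(block):
    """Assumed disorder measure: mean absolute deviation of the block's pixels."""
    mean = sum(block) / len(block)
    return sum(abs(v - mean) for v in block) / len(block)


def choose_plane(block, threshold=10.0, smooth_plane=1, busy_plane=3):
    """Use a higher (stronger) bit-plane where texture hides the change better."""
    return busy_plane if local_disorder(block) > threshold else smooth_plane


smooth_block = [100, 100, 100, 100]
busy_block = [0, 255, 0, 255]
print(choose_plane(smooth_block), choose_plane(busy_block))

marked = embed_bit(165, 0, 2)  # clear bit 2 of 0b10100101
print(marked, extract_bit(marked, 2))
```

Writing to a higher bit-plane changes the pixel value more, which tends to survive more processing but is only visually acceptable in regions with high local disorder.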
CV · Mar 27, 2020
Acceleration of Convolutional Neural Network Using FFT-Based Split Convolutions
Kamran Chitsaz, Mohsen Hajabdollahi, Nader Karimi et al.
Convolutional neural networks (CNNs) have a large number of variables and hence suffer from a complexity problem in their implementation. Different methods and techniques, such as quantization and pruning, have been developed to alleviate the problem of CNN complexity. Among the different simplification methods, computation in the Fourier domain is regarded as a new paradigm for the acceleration of CNNs. Recent studies on Fast Fourier Transform (FFT) based CNNs aim at simplifying the computations required for the FFT. However, there is still a lot of room for reducing the computational complexity of the FFT. In this paper, a new method for CNN processing in the FFT domain is proposed, which is based on input splitting. There are problems in the computation of the FFT with small kernels, as is typical in CNNs. Splitting can be considered an effective solution for the issues caused by small kernels. With splitting, redundancy such as that of overlap-and-add is reduced, and efficiency is increased. A hardware implementation of the proposed FFT method, as well as different analyses of the complexity, are performed to demonstrate the proper performance of the proposed method.
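For background, the sketch below shows the generic idea of splitting an input and convolving each piece in the Fourier domain, using the textbook overlap-add scheme in 1-D with a naive DFT. It is not the splitting method proposed in the paper, whose details the abstract does not give; all function names and the block size are illustrative assumptions.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for a real FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out


def fourier_convolve(a, b):
    """Linear convolution via zero-padding and pointwise spectrum products."""
    n = len(a) + len(b) - 1
    fa = dft(list(a) + [0.0] * (n - len(a)))
    fb = dft(list(b) + [0.0] * (n - len(b)))
    return [v.real for v in dft([x * y for x, y in zip(fa, fb)], inverse=True)]


def split_convolve(signal, kernel, block=4):
    """Overlap-add: convolve each block separately and sum the overlapping tails."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for start in range(0, len(signal), block):
        piece = fourier_convolve(signal[start:start + block], kernel)
        for i, v in enumerate(piece):
            out[start + i] += v
    return out


def direct_convolve(signal, kernel):
    """Reference spatial-domain convolution used to check the result."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


signal = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
kernel = [1, 0, -1]
print(split_convolve(signal, kernel))
print(direct_convolve(signal, kernel))
```

Each block's output is longer than the block by the kernel length minus one, and those tails overlap with the next block's output; that overlapping work is the kind of redundancy the abstract refers to.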
IVFeb 26, 2020
Region of Interest Identification for Brain Tumors in Magnetic Resonance ImagesFateme Mostafaie, Reihaneh Teimouri, Zahra Nabizadeh et al.
Glioma is a common type of brain tumor, and its accurate detection plays a vital role in the diagnosis and treatment process. Despite advances in medical image analysis, accurate tumor segmentation in brain magnetic resonance (MR) images remains a challenge due to variations in tumor texture, position, and shape. In this paper, we propose a fast, automated method, with light computational complexity, to find the smallest bounding box around the tumor region. This region-of-interest can be used as a preprocessing step in training networks for subregion tumor segmentation. By adopting the outputs of this algorithm, redundant information is removed; hence the network can focus on learning notable features related to subregions' classes. The proposed method has six main stages, of which brain segmentation is the most vital. Expectation-maximization (EM) and K-means algorithms are used for brain segmentation. The proposed method is evaluated on the BraTS 2015 dataset, and the average DICE score obtained is 0.73, which is an acceptable result for this application.
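As a small illustration of the final output described above, the sketch below computes the smallest axis-aligned bounding box around the nonzero pixels of a binary mask and crops an image to it. It assumes a tumor mask is already available as a list of rows; the six stages of the actual method are not reproduced, and `bounding_box` and `crop` are hypothetical helper names.

```python
def bounding_box(mask):
    """Return (row_min, col_min, row_max, col_max) of nonzero pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # empty mask: no region of interest
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]

def crop(image, box):
    """Crop a 2-D image (list of rows) to an inclusive bounding box."""
    r0, c0, r1, c1 = box
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]
```

Cropping the input to this box before training a subregion segmentation network is what removes the redundant background mentioned in the abstract.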
CVFeb 9, 2020
Unlabeled Data Deployment for Classification of Diabetic Retinopathy Images Using Knowledge TransferSajjad Abbasi, Mohsen Hajabdollahi, Nader Karimi et al.
Convolutional neural networks (CNNs) are highly beneficial for medical image processing. Medical images are plentiful, but there is a lack of annotated data. Transfer learning is used to solve the problem of the lack of labeled data and grants CNNs better training capability. Transfer learning can be used in many different medical applications; however, the model under transfer should have the same size as the original network. Knowledge distillation was recently proposed to transfer the knowledge of one model to another and can be useful to cover the shortcomings of transfer learning. However, some parts of the knowledge may not be distilled by knowledge distillation. In this paper, a novel knowledge distillation method using transfer learning is proposed to transfer the whole knowledge of one model to another. The proposed method can be beneficial and practical for medical image analysis, in which only a small amount of labeled data is available. The proposed process is tested for diabetic retinopathy classification. Simulation results demonstrate that, using the proposed method, the knowledge of a large network can be transferred to a smaller model.
CVFeb 9, 2020
Splitting Convolutional Neural Network Structures for Efficient InferenceEmad MalekHosseini, Mohsen Hajabdollahi, Nader Karimi et al.
For convolutional neural networks (CNNs) that have a large volume of input data, memory management becomes a major concern. Memory cost reduction, which can be realized through different techniques such as feature map pruning and input data splitting, can be an effective way to deal with this problem. Among the various methods in this area of research, splitting the network structure is an interesting direction, and only a few works have addressed it. In this study, the problem of reducing memory utilization using network structure splitting is addressed. A new technique is proposed to split the network structure into small parts that consume less memory than the original network. The split parts can be processed almost separately, which plays an essential role in better memory management. The split approach has been tested on two well-known network structures, VGG16 and ResNet18, for the classification of CIFAR10 images. Simulation results show that the splitting method reduces both the number of computational operations and the amount of memory consumption.
CVFeb 9, 2020
Convolutional Neural Network Pruning Using Filter AttenuationMorteza Mousa-Pasandi, Mohsen Hajabdollahi, Nader Karimi et al.
Filters are the essential elements in convolutional neural networks (CNNs). Filters correspond to the feature maps and account for the main part of the computational and memory requirements of CNN processing. In filter pruning methods, a filter with all of its components, including channels and connections, is removed. The removal of a filter can cause a drastic change in the network's performance. Also, the removed filters cannot come back to the network structure. We want to address these problems in this paper. We propose a CNN pruning method based on filter attenuation in which weak filters are not directly removed. Instead, weak filters are attenuated and gradually removed. In the proposed attenuation approach, weak filters are not abruptly removed, and there is a chance for these filters to return to the network. The filter attenuation method is assessed using the VGG model for the CIFAR10 image classification task. Simulation results show that filter attenuation works with different pruning criteria, and better results are obtained in comparison with conventional pruning methods.
IVFeb 5, 2020
Brain Tumor Segmentation by Cascaded Deep Neural Networks Using Multiple Image ScalesZahra Sobhaninia, Safiyeh Rezaei, Nader Karimi et al.
Intracranial tumors are groups of cells that usually grow uncontrollably. One out of four cancer deaths is due to brain tumors. Early detection and evaluation of brain tumors is an essential preventive medical step that is performed by magnetic resonance imaging (MRI). Many segmentation techniques exist for this purpose. Low segmentation accuracy is the main drawback of existing methods. In this paper, we use a deep learning method to boost the accuracy of tumor segmentation in MR images. A cascade approach is used with multiple scales of images to induce both local and global views and help the network reach higher accuracy. Our experimental results show that using multiple scales and two cascaded networks is advantageous.
CVJan 13, 2020
Modeling of Pruning Techniques for Deep Neural Networks SimplificationMorteza Mousa Pasandi, Mohsen Hajabdollahi, Nader Karimi et al.
Convolutional Neural Networks (CNNs) suffer from different issues, such as computational complexity and the number of parameters. In recent years, pruning techniques have been employed to reduce the number of operations and the model size in CNNs. Different pruning methods have been proposed, which are based on pruning the connections, channels, and filters. Various techniques and tricks accompany pruning methods, and there is no unifying framework to model all the pruning methods. In this paper, pruning methods are investigated, and a general model that covers the majority of pruning techniques is proposed. The advantages and disadvantages of the pruning methods can be identified, and all of them can be summarized under this model. The final goal of this model is to provide a general approach for all of the pruning methods with different structures and applications.
CVJan 10, 2020
Image Inpainting by Multiscale Spline InterpolationGhazale Ghorbanzade, Zahra Nabizadeh, Nader Karimi et al.
Recovering the missing regions of an image is a task that is called image inpainting. Depending on the shape of missing areas, different methods are presented in the literature. One of the challenges of this problem is extracting features that lead to better results. Experimental results show that both global and local features are useful for this purpose. In this paper, we propose a multi-scale image inpainting method that utilizes both local and global features. The first step of this method is to determine how many scales we need to use, which depends on the width of the lines in the map of the missing region. Then we apply adaptive image inpainting to the damaged areas of the image, and the lost pixels are predicted. Each scale is inpainted and the result is resized to the original size. Then a voting process produces the final result. The proposed method is tested on damaged images with scratches and creases. The metric that we use to evaluate our approach is PSNR. On average, we achieved 1.2 dB improvement over some existing inpainting approaches.
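Since the evaluation above is reported in PSNR, a short sketch of the standard definition may help: PSNR = 10·log10(peak² / MSE). The function name `psnr` and the default peak of 255 (8-bit images) are assumptions made for illustration; images are plain lists of rows.

```python
import math

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    total, count = 0.0, 0
    for ref_row, res_row in zip(reference, restored):
        for a, b in zip(ref_row, res_row):
            total += (a - b) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, a gain of about 1.2 dB corresponds to roughly a 24% reduction in mean squared error.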
MMJan 9, 2020
Adaptive Control of Embedding Strength in Image Watermarking using Neural NetworksMahnoosh Bagheri, Majid Mohrekesh, Nader Karimi et al.
Digital image watermarking has been widely used in different applications, such as copyright protection of digital media, including audio, image, and video files. Two opposing criteria, robustness and transparency, are the goals of watermarking methods. In this paper, we propose a framework for determining the appropriate embedding strength factor. The framework can use most DWT- and DCT-based blind watermarking approaches. We use Mask R-CNN on the COCO dataset to find a good strength factor for each sub-block. Experiments show that this method is robust against different attacks and has good transparency.
CVDec 31, 2019
Image Seam-Carving by Controlling Positional Distribution of SeamsMahdi Ahmadi, Nader Karimi, Shadrokh Samavi
Image retargeting is a new image processing task that changes the aspect ratio of images. One of the most famous image-retargeting algorithms is seam-carving. Although seam-carving is fast and straightforward, it usually distorts the images. In this paper, we introduce a new seam-carving algorithm that not only has the simplicity of the original seam-carving but also lacks the unwanted distortion present in the original method. The positional distribution of seams is introduced. We show that the proposed method outperforms the original seam-carving in terms of retargeted image quality assessment and seam coagulation measures.
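For context, the sketch below shows the dynamic-programming core of the original seam-carving algorithm: finding the minimum-energy vertical seam and removing it. It does not implement the positional-distribution control proposed above. The energy map is assumed to be precomputed (for example from image gradients), and the helper names `min_vertical_seam` and `remove_seam` are illustrative.

```python
def min_vertical_seam(energy):
    """Return one column index per row for the minimum-energy vertical seam."""
    h, w = len(energy), len(energy[0])
    cost = [list(energy[0])]  # cumulative minimum cost, row by row
    back = []                 # back[r - 1][c] = predecessor column in row r - 1
    for r in range(1, h):
        row_cost, row_back = [], []
        for c in range(w):
            # A seam may move at most one column left or right per row.
            candidates = [cc for cc in (c - 1, c, c + 1) if 0 <= cc < w]
            best = min(candidates, key=lambda cc: cost[r - 1][cc])
            row_cost.append(energy[r][c] + cost[r - 1][best])
            row_back.append(best)
        cost.append(row_cost)
        back.append(row_back)
    # Start from the cheapest cell in the last row and walk back to the top.
    c = min(range(w), key=lambda cc: cost[-1][cc])
    seam = [c]
    for r in range(h - 1, 0, -1):
        c = back[r - 1][c]
        seam.append(c)
    seam.reverse()
    return seam

def remove_seam(image, seam):
    """Delete one pixel per row along the seam, narrowing the image by one column."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]
```

Repeating this step removes one column at a time. Because each seam is chosen purely by energy, many seams can pass through the same low-energy region, which is one way the distortion mentioned in the abstract can arise.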
CVDec 31, 2019
Modeling Neural Architecture Search Methods for Deep NetworksEmad Malekhosseini, Mohsen Hajabdollahi, Nader Karimi et al.
There are many research works on designing architectures for deep neural networks (DNNs), which are known as neural architecture search (NAS) methods. Although there are many automatic and manual techniques for NAS problems, there is no unifying model in which these NAS methods can be explored and compared. In this paper, we propose a general abstraction model for NAS methods. By using the proposed framework, it is possible to compare different design approaches for categorizing and identifying critical areas of interest in designing DNN architectures. Also, under this framework, different methods in the NAS area are summarized; hence a better view of their advantages and disadvantages is possible.
CVDec 31, 2019
Modeling Teacher-Student Techniques in Deep Neural Networks for Knowledge DistillationSajjad Abbasi, Mohsen Hajabdollahi, Nader Karimi et al.
Knowledge distillation (KD) is a new method for transferring the knowledge of a structure under training to another one. The typical application of KD is learning a small model (called the student) from soft labels produced by a complex model (called the teacher). Owing to the novel idea introduced in KD, its notion has recently been used in different methods, such as compression and processes intended to enhance model accuracy. Although different techniques have been proposed in the area of KD, there is a lack of a model to generalize KD techniques. In this paper, various studies in the scope of KD are investigated and analyzed to build a general model for KD. All the methods and techniques in KD can be summarized through the proposed model. By utilizing the proposed model, different methods in KD are better investigated and explored. The advantages and disadvantages of different approaches in KD can be better understood, and developing a new strategy for KD becomes possible. Using the proposed model, different KD methods are represented in an abstract view.
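The teacher-student setup with soft labels can be sketched as follows. This is the commonly used temperature-softened formulation, not the general model proposed above: the teacher's logits are turned into soft labels with a temperature, and the student is penalized by the KL divergence from them. The names `softmax_t` and `distillation_loss` and the default temperature of 4.0 are assumptions for illustration.

```python
import math

def softmax_t(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; higher values flatten it."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) computed on temperature-softened distributions."""
    p = softmax_t(teacher_logits, temperature)  # soft labels from the teacher
    q = softmax_t(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

In practice this term is usually combined with an ordinary cross-entropy loss on the ground-truth labels; the weighting between the two is a design choice not specified here.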
CVDec 27, 2019
A General Framework for Saliency Detection MethodsFateme Mostafaie, Zahra Nabizadeh, Nader Karimi et al.
Saliency detection is one of the most challenging problems in image analysis and computer vision. Many approaches propose different architectures based on the psychological and biological properties of the human visual attention system. However, there is still no abstract framework that summarizes the existing methods. In this paper, we offer a general framework for saliency models, which consists of five main steps: pre-processing, feature extraction, saliency map generation, saliency map combination, and post-processing. Also, we study different saliency models containing each level and compare their performance. This framework helps researchers to have a comprehensive view when studying new methods.
CVDec 27, 2019
An Abstraction Model for Semantic Segmentation AlgorithmsReihaneh Teymoori, Zahra Nabizadeh, Nader Karimi et al.
Semantic segmentation classifies each pixel in the image. Due to its advantages, semantic segmentation is used in many tasks, such as cancer detection, robot-assisted surgery, satellite image analysis, and self-driving cars. Accuracy and efficiency are the two crucial goals for this purpose, and several state-of-the-art neural networks exist. By employing different techniques, new solutions have been presented in each method to increase efficiency and accuracy and reduce costs. However, the diversity of the implemented approaches for semantic segmentation makes it difficult for researchers to achieve a comprehensive view of the field. In this paper, an abstraction model for semantic segmentation offers a comprehensive view of the field. The proposed framework consists of four general blocks that cover the operation of the majority of semantic segmentation methods. We also compare different approaches and analyze each of the four abstraction blocks' importance in each method's operation.
CVDec 20, 2019
Saliency Based Fire Detection Using Texture and Color FeaturesMaedeh Jamali, Nader Karimi, Shadrokh Samavi
Due to industrial development and the extension of urban areas, early warning systems have an essential role in signaling emergencies. Fire is an event that can rapidly spread and cause injury, death, and damage. Early detection of fire could significantly reduce this harm. Video-based fire detection is a low-cost and fast method in comparison with conventional fire detectors. Most available fire detection methods have a high false-positive rate and low accuracy. In this paper, we increase accuracy by using spatial and temporal features. Captured video sequences are divided into spatio-temporal blocks. Then a saliency map and a combination of color and texture features are used for detecting fire regions. We use the HSV color model as a spatial feature and LBP-TOP for temporal processing of fire texture. Fire detection tests on publicly available datasets have shown the accuracy and robustness of the algorithm.
IVNov 3, 2019
Gland Segmentation in Histopathological Images by Deep Neural NetworkSafiye Rezaei, Ali Emami, Nader Karimi et al.
Histology is vital in the diagnosis and prognosis of cancers and many other diseases. For the analysis of histopathological images, we need to detect and segment all gland structures. These images are very challenging, and the task of segmentation is challenging even for specialists. Segmentation of glands helps determine the grade of cancers such as colon, breast, and prostate cancer. Given that deep neural networks have achieved high performance on medical images, we propose a method based on the LinkNet network for gland segmentation. We examine the effects of using different loss functions. Using the Warwick-QU dataset, which contains two test sets and one training set, we show that our approach is comparable to state-of-the-art methods. Finally, it is shown that enhancing the gland edges and using hematoxylin components can improve the performance of the proposed model.
IVNov 3, 2019
Localization of Fetal Head in Ultrasound Images by Multiscale View and Deep Neural NetworksZahra Sobhaninia, Ali Emami, Nader Karimi et al.
One of the routine examinations used for prenatal care in many countries is ultrasound imaging. This procedure provides various information about the health and development of the fetus, the progress of the pregnancy, and the baby's due date. Some of the biometric parameters of the fetus, like fetal head circumference (HC), must be measured to check the fetus's health and growth. In this paper, we investigate the effects of using multi-scale inputs in the network. We also propose a light convolutional neural network for automatic HC measurement. Experimental results on an ultrasound dataset of fetuses in different trimesters of pregnancy show that the segmentation accuracy and HC evaluations achieved by a light convolutional neural network are comparable to those of deep convolutional neural networks. The proposed network has fewer parameters and requires less training time.
IVNov 3, 2019
Image Inpainting by Adaptive Fusion of Variable Spline InterpolationsZahra Nabizadeh, Ghazale Ghorbanzade, Nader Karimi et al.
There are many methods for image enhancement. Image inpainting is one of them, which can be used in the reconstruction and restoration of scratched images or in editing images by adding or removing objects. Depending on the application, different algorithmic and learning methods have been proposed. In this paper, the focus is on applications that enhance old and historical scratched images. For this purpose, we propose an adaptive spline interpolation. In this method, a different number of neighbors in four directions is considered for each pixel in the lost block. In previous methods, predicting the lost pixels that lie on edges is a problem. To address this problem, we consider horizontal and vertical edge information. If the pixel is located on an edge, then we use the predicted value in that direction. In other situations, irrelevant predicted values are omitted, and the average of the remaining values is used as the value of the missing pixel. The method is evaluated using the PSNR and SSIM metrics on the Kodak dataset. The results show improvement in PSNR and SSIM compared to similar procedures. Also, the run time of the proposed method is lower than that of the others.
MMNov 2, 2019
Robustness and Imperceptibility Enhancement in Watermarked Images by Color TransformationMaedeh Jamali, Mahnoosh Bagheri, Nader Karimi et al.
One of the effective methods for the preservation of copyright ownership of digital media is watermarking. Different watermarking techniques try to set a tradeoff between robustness and transparency of the process. In this research work, we have used color space conversion and frequency transform to achieve high robustness and transparency. Due to the distribution of image information in the RGB domain, we use the YUV color space, which concentrates the visual information in the Y channel. Embedding of the watermark is performed in the DCT coefficients of the specific wavelet subbands. Experimental results show high transparency and robustness of the proposed method.
MMNov 1, 2019
BlessMark: A Blind Diagnostically-Lossless Watermarking Framework for Medical Applications Based on Deep Neural NetworksHamidreza Zarrabi, Ali Emami, Pejman Khadivi et al.
Nowadays, with the growth of public network usage, medical information is transmitted between hospitals. A watermarking system can help protect the confidentiality of medical information distributed over the internet. In medical images, regions-of-interest (ROI) contain diagnostic information. The watermark should be embedded only into non-regions-of-interest (NROI) to keep diagnostic information without distortion. Recently, ROI-based watermarking has attracted the attention of the medical research community. The ROI map can be used as an embedding key for improving confidentiality protection. However, in most existing works, the ROI map that is used for the embedding process must be sent as side information along with the watermarked image. This side information is a disadvantage and makes the extraction process non-blind. Also, most existing algorithms do not recover the NROI of the original cover image after the extraction of the watermark. In this paper, we propose a framework for blind diagnostically-lossless watermarking, which iteratively embeds only into NROI. The significance of the proposed framework is in satisfying the confidentiality of the patient information through a blind watermarking system, while it preserves diagnostic/medical information of the image throughout the watermarking process. A deep neural network is used to recognize the ROI map in the embedding, extraction, and recovery processes. In the extraction process, the same ROI map of the embedding process is recognized without requiring any additional information. Hence, the watermark is blindly extracted from the NROI.
CVOct 17, 2019
Context-Aware Saliency Detection for Image Retargeting Using Convolutional Neural NetworksMahdi Ahmadi, Nader Karimi, Shadrokh Samavi
Image retargeting is the task of making images capable of being displayed on screens with different sizes. This work should be done so that high-level visual information and low-level features such as texture remain as intact as possible to the human visual system, while the output image may have different dimensions. Thus, simple methods such as scaling and cropping are not adequate for this purpose. In recent years, researchers have tried to improve the existing retargeting methods and introduce new ones. However, a specific method cannot be utilized to retarget all types of images. In other words, different images require different retargeting methods. Image retargeting has a close relationship to image saliency detection, which is a relatively new image processing task. Earlier saliency detection methods were based on local and global but low-level image information. These methods are called bottom-up methods. On the other hand, newer approaches are top-down and mixed methods that also consider the high-level and semantic information of the image. In this paper, we introduce the proposed methods in both saliency detection and retargeting. For saliency detection, the use of image context and semantic segmentation is examined, and a novel mixed bottom-up and top-down saliency detection method is introduced. After saliency detection, a modified version of an existing retargeting method is utilized for retargeting the images. The results suggest that the proposed image retargeting pipeline has excellent performance compared to other tested methods. Also, the subjective evaluations on the Pascal dataset can be used as a retargeting quality assessment dataset for further research.
IVAug 31, 2019
Fetal Ultrasound Image Segmentation for Measuring Biometric Parameters Using Multi-Task Deep LearningZahra Sobhaninia, Shima Rafiei, Ali Emami et al.
Ultrasound imaging is a standard examination during pregnancy that can be used for measuring specific biometric parameters towards prenatal diagnosis and estimating gestational age. Fetal head circumference (HC) is one of the significant factors for determining fetal growth and health. In this paper, a multi-task deep convolutional neural network is proposed for automatic segmentation and estimation of the HC ellipse by minimizing a compound cost function composed of the segmentation dice score and the MSE of the ellipse parameters. Experimental results on a fetal ultrasound dataset covering different trimesters of pregnancy show that the segmentation results and the extracted HC match well with the radiologist annotations. The obtained dice scores of the fetal head segmentation and the accuracy of the HC evaluations are comparable to the state-of-the-art.
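One plausible form of such a compound cost is sketched below: a soft Dice loss on the segmentation mask plus a weighted MSE on the ellipse parameters. The exact formulation and weighting used in the paper are not stated in the abstract, so the `weight` parameter, the choice of five ellipse parameters (center, semi-axes, angle), and all function names are assumptions. Masks are flattened lists of probabilities for simplicity.

```python
def soft_dice(pred, target, eps=1e-6):
    """Soft Dice coefficient between two flattened masks with values in [0, 1]."""
    inter = sum(p * t for p, t in zip(pred, target))
    return (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)

def mse(a, b):
    """Mean squared error between two equal-length parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def compound_loss(pred_mask, true_mask, pred_ellipse, true_ellipse, weight=1.0):
    """Dice loss on the mask plus weighted MSE on the ellipse parameters.

    Ellipse vectors are assumed to be (cx, cy, a, b, theta); the weight
    balancing the two terms is a hypothetical hyperparameter.
    """
    dice_loss = 1.0 - soft_dice(pred_mask, true_mask)
    return dice_loss + weight * mse(pred_ellipse, true_ellipse)
```

Training on both terms at once lets the regression head and the segmentation head share features, which is the usual motivation for a multi-task setup.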
IVAug 31, 2019
Gland Segmentation in Histopathology Images Using Deep Networks and Handcrafted FeaturesSafiyeh Rezaei, Ali Emami, Hamidreza Zarrabi et al.
Histopathology images contain essential information for medical diagnosis and prognosis of cancerous disease. Segmentation of glands in histopathology images is a primary step for analysis and diagnosis of an unhealthy patient. Due to the widespread application and the great success of deep neural networks in intelligent medical diagnosis and histopathology, we propose a modified version of LinkNet for gland segmentation and recognition of malignant cases. We show that using specific handcrafted features such as invariant local binary pattern drastically improves the system performance. The experimental results demonstrate the competency of the proposed system against state-of-the-art methods. We achieved the best results in testing on section B images of the Warwick-QU dataset and obtained comparable results on section A images.
MMOct 16, 2018
ReDMark: Framework for Residual Diffusion Watermarking on Deep NetworksMahdi Ahmadi, Alireza Norouzi, S. M. Reza Soroushmehr et al.
Due to the rapid growth of machine learning tools, and specifically deep networks, in various computer vision and image processing areas, the application of Convolutional Neural Networks to watermarking has recently emerged. In this paper, we propose a deep end-to-end diffusion watermarking framework (ReDMark) which can be adapted for any desired transform space. The framework is composed of two Fully Convolutional Neural Networks with a residual structure for embedding and extraction. The whole deep network is trained end-to-end to conduct blind secure watermarking. The framework is customizable for the level of robustness vs. imperceptibility. It is also adjustable for the trade-off between capacity and robustness. The proposed framework simulates various attacks as a differentiable network layer to facilitate end-to-end training. For the JPEG attack, a differentiable approximation is utilized, which drastically improves the watermarking robustness to this attack. Another important characteristic of the proposed framework, which leads to improved security and robustness, is its capability to diffuse watermark information over a relatively wide area of the image. Comparative results against recent state-of-the-art research highlight the superiority of the proposed framework in terms of imperceptibility and robustness.
CVSep 22, 2018
Artistic Instance-Aware Image Filtering by Convolutional Neural NetworksMilad Tehrani, Mahnoosh Bagheri, Mahdi Ahmadi et al.
In recent years, public use of artistic effects for editing and beautifying images has encouraged researchers to look for new approaches to this task. Most of the existing methods apply artistic effects to the whole image. Exploiting neural network vision technologies like object detection and semantic segmentation could offer a new viewpoint in this area. In this paper, we utilize an instance segmentation neural network to obtain a class mask for separately filtering the background and foreground of an image. We implement a top prior-mask selection that lets us select an object class for filtering. Different artistic effects are used in the filtering process to meet the requirements of a wide variety of users. Also, our method is flexible enough to allow the addition of new filters. We use Mask R-CNN instance segmentation pre-trained on the COCO dataset as the segmentation network. Experiments on the use of different filters are performed. The system's output shows that this novel approach can create satisfying artistic images with fast operation and a simple interface.
CVSep 20, 2018
Brain Tumor Segmentation Using Deep Learning by Type Specific Sorting of ImagesZahra Sobhaninia, Safiyeh Rezaei, Alireza Noroozi et al.
Recently, deep learning has been playing a major role in the field of computer vision. One of its applications is the reduction of human judgment in the diagnosis of diseases. In particular, brain tumor diagnosis requires high accuracy, where minute errors in judgment may lead to disaster. For this reason, brain tumor segmentation is an important challenge for medical purposes. Currently, several methods exist for tumor segmentation, but they all lack high accuracy. Here we present a solution for brain tumor segmentation using deep learning. In this work, we studied different angles of brain MR images and applied different networks for segmentation. The effect of using separate networks for segmentation of MR images is evaluated by comparing the results with a single network. Experimental evaluations of the networks show that a Dice score of 0.73 is achieved for a single network and 0.79 is obtained for multiple networks.