CVJan 19, 2024
Learned Image resizing with efficient training (LRET) facilitates improved performance of large-scale digital histopathology image classification modelsMd Zahangir Alom, Quynh T. Tran, Brent A. Orr
Histologic examination plays a crucial role in oncology research and diagnostics. The adoption of digital scanning of whole slide images (WSI) has created an opportunity to leverage deep learning-based image classification methods to enhance diagnosis and risk stratification. Technical limitations of current approaches to training deep convolutional neural networks (DCNN) result in suboptimal model performance and make training and deployment of comprehensive classification models unobtainable. In this study, we introduce a novel approach that addresses the main limitations of traditional histopathology classification model training. Our method, termed Learned Resizing with Efficient Training (LRET), couples efficient training techniques with image resizing to facilitate seamless integration of larger histology image patches into state-of-the-art classification models while preserving important structural information. We used the LRET method coupled with two distinct resizing techniques to train three diverse histology image datasets using multiple diverse DCNN architectures. Our findings demonstrate a significant enhancement in classification performance and training efficiency. Across the spectrum of experiments, LRET consistently outperforms existing methods, yielding a substantial improvement of 15-28% in accuracy for a large-scale, multiclass tumor classification task consisting of 74 distinct brain tumor types. LRET not only elevates classification accuracy but also substantially reduces training times, unlocking the potential for faster model development and iteration. The implications of this work extend to broader applications within medical imaging and beyond, where efficient integration of high-resolution images into deep learning pipelines is paramount for driving advancements in research and clinical practice.
IVMay 5, 2021
R2U3D: Recurrent Residual 3D U-Net for Lung SegmentationDhaval D. Kadia, Md Zahangir Alom, Ranga Burada et al.
3D lung segmentation is essential since it processes the volumetric information of the lungs, removes the unnecessary areas of the scan, and segments the actual area of the lungs in a 3D volume. Recently, the deep learning model, such as U-Net outperforms other network architectures for biomedical image segmentation. In this paper, we propose a novel model, namely, Recurrent Residual 3D U-Net (R2U3D), for the 3D lung segmentation task. In particular, the proposed model integrates 3D convolution into the Recurrent Residual Neural Network based on U-Net. It helps learn spatial dependencies in 3D and increases the propagation of 3D volumetric information. The proposed R2U3D network is trained on the publicly available dataset LUNA16 and it achieves state-of-the-art performance on both LUNA16 (testing set) and VESSEL12 dataset. In addition, we show that training the R2U3D model with a smaller number of CT scans, i.e., 100 scans, without applying data augmentation achieves an outstanding result in terms of Soft Dice Similarity Coefficient (Soft-DSC) of 0.9920.
IVJul 5, 2020
GanglionNet: Objectively Assess the Density and Distribution of Ganglion Cells With NABLA-N NetworkMd Zahangir Alom, Raj P. Kapur, TJ Browen et al.
Hirschsprungs disease (HD) is a birth defect which is diagnosed and managed by multiple medical specialties such as pediatric gastroenterology, surgery, radiology, and pathology. HD is characterized by absence of ganglion cells in the distal intestinal tract with a gradual normalization of ganglion cell numbers in adjacent upstream bowel, termed as the transition zone (TZ). Definitive surgical management to remove the abnormal bowel requires accurate assessment of ganglion cell density in histological sections from the TZ, which is difficult, time-consuming and prone to operator error. We present an automated method to detect and count immunostained ganglion cells using a new NABLA_N network based deep learning (DL) approach, called GanglionNet. The morphological image analysis methods are applied for refinement of the regions for counting of the cells and define ganglia regions (a set of ganglion cells) from the predicted masks. The proposed model is trained with single point annotated samples by the expert pathologist. The GanglionNet is tested on ten completely new High Power Field (HPF) images with dimension of 2560x1920 pixels and the outputs are compared against the manual counting results by the expert pathologist. The proposed method shows a robust 97.49% detection accuracy for ganglion cells, when compared to counts by the expert pathologist, which demonstrates the robustness of GanglionNet. The proposed DL based ganglion cell detection and counting method will simplify and standardize TZ diagnosis for HD patients.
IVApr 7, 2020
COVID_MTNet: COVID-19 Detection with Multi-Task Deep Learning ApproachesMd Zahangir Alom, M M Shaifur Rahman, Mst Shamima Nasrin et al.
COVID-19 is currently one the most life-threatening problems around the world. The fast and accurate detection of the COVID-19 infection is essential to identify, take better decisions and ensure treatment for the patients which will help save their lives. In this paper, we propose a fast and efficient way to identify COVID-19 patients with multi-task deep learning (DL) methods. Both X-ray and CT scan images are considered to evaluate the proposed technique. We employ our Inception Residual Recurrent Convolutional Neural Network with Transfer Learning (TL) approach for COVID-19 detection and our NABLA-N network model for segmenting the regions infected by COVID-19. The detection model shows around 84.67% testing accuracy from X-ray images and 98.78% accuracy in CT-images. A novel quantitative analysis strategy is also proposed in this paper to determine the percentage of infected regions in X-ray and CT images. The qualitative and quantitative results demonstrate promising results for COVID-19 detection and infected region localization.
CVApr 25, 2019
Skin Cancer Segmentation and Classification with NABLA-N and Inception Recurrent Residual Convolutional NetworksMd Zahangir Alom, Theus Aspiras, Tarek M. Taha et al.
In the last few years, Deep Learning (DL) has been showing superior performance in different modalities of biomedical image analysis. Several DL architectures have been proposed for classification, segmentation, and detection tasks in medical imaging and computational pathology. In this paper, we propose a new DL architecture, the NABLA-N network, with better feature fusion techniques in decoding units for dermoscopic image segmentation tasks. The NABLA-N network has several advances for segmentation tasks. First, this model ensures better feature representation for semantic segmentation with a combination of low to high-level feature maps. Second, this network shows better quantitative and qualitative results with the same or fewer network parameters compared to other methods. In addition, the Inception Recurrent Residual Convolutional Neural Network (IRRCNN) model is used for skin cancer classification. The proposed NABLA-N network and IRRCNN models are evaluated for skin cancer segmentation and classification on the benchmark datasets from the International Skin Imaging Collaboration 2018 (ISIC-2018). The experimental results show superior performance on segmentation tasks compared to the Recurrent Residual U-Net (R2U-Net). The classification model shows around 87% testing accuracy for dermoscopic skin cancer classification on ISIC2018 dataset.
CVApr 19, 2019
Advanced Deep Convolutional Neural Network Approaches for Digital Pathology Image Analysis: a comprehensive evaluation with different use casesMd Zahangir Alom, Theus Aspiras, Tarek M. Taha et al.
Deep Learning (DL) approaches have been providing state-of-the-art performance in different modalities in the field of medical imagining including Digital Pathology Image Analysis (DPIA). Out of many different DL approaches, Deep Convolutional Neural Network (DCNN) technique provides superior performance for classification, segmentation, and detection tasks. Most of the task in DPIA problems are somehow possible to solve with classification, segmentation, and detection approaches. In addition, sometimes pre and post-processing methods are applied for solving some specific type of problems. Recently, different DCNN models including Inception residual recurrent CNN (IRRCNN), Densely Connected Recurrent Convolution Network (DCRCN), Recurrent Residual U-Net (R2U-Net), and R2U-Net based regression model (UD-Net) have proposed and provide state-of-the-art performance for different computer vision and medical image analysis tasks. However, these advanced DCNN models have not been explored for solving different problems related to DPIA. In this study, we have applied these DCNN techniques for solving different DPIA problems and evaluated on different publicly available benchmark datasets for seven different tasks in digital pathology including lymphoma classification, Invasive Ductal Carcinoma (IDC) detection, nuclei segmentation, epithelium segmentation, tubule segmentation, lymphocyte detection, and mitosis detection. The experimental results are evaluated with different performance metrics such as sensitivity, specificity, accuracy, F1-score, Receiver Operating Characteristics (ROC) curve, dice coefficient (DC), and Means Squired Errors (MSE). The results demonstrate superior performance for classification, segmentation, and detection tasks compared to existing machine learning and DCNN based approaches.
CVNov 10, 2018
Breast Cancer Classification from Histopathological Images with Inception Recurrent Residual Convolutional Neural NetworkMd Zahangir Alom, Chris Yakopcic, Tarek M. Taha et al.
The Deep Convolutional Neural Network (DCNN) is one of the most powerful and successful deep learning approaches. DCNNs have already provided superior performance in different modalities of medical imaging including breast cancer classification, segmentation, and detection. Breast cancer is one of the most common and dangerous cancers impacting women worldwide. In this paper, we have proposed a method for breast cancer classification with the Inception Recurrent Residual Convolutional Neural Network (IRRCNN) model. The IRRCNN is a powerful DCNN model that combines the strength of the Inception Network (Inception-v4), the Residual Network (ResNet), and the Recurrent Convolutional Neural Network (RCNN). The IRRCNN shows superior performance against equivalent Inception Networks, Residual Networks, and RCNNs for object recognition tasks. In this paper, the IRRCNN approach is applied for breast cancer classification on two publicly available datasets including BreakHis and Breast Cancer Classification Challenge 2015. The experimental results are compared against the existing machine learning and deep learning-based approaches with respect to image-based, patch-based, image-level, and patient-level classification. The IRRCNN model provides superior classification performance in terms of sensitivity, Area Under the Curve (AUC), the ROC curve, and global accuracy compared to existing approaches for both datasets.
CVNov 8, 2018
Microscopic Nuclei Classification, Segmentation and Detection with improved Deep Convolutional Neural Network (DCNN) ApproachesMd Zahangir Alom, Chris Yakopcic, Tarek M. Taha et al.
Due to cellular heterogeneity, cell nuclei classification, segmentation, and detection from pathological images are challenging tasks. In the last few years, Deep Convolutional Neural Networks (DCNN) approaches have been shown state-of-the-art (SOTA) performance on histopathological imaging in different studies. In this work, we have proposed different advanced DCNN models and evaluated for nuclei classification, segmentation, and detection. First, the Densely Connected Recurrent Convolutional Network (DCRN) model is used for nuclei classification. Second, Recurrent Residual U-Net (R2U-Net) is applied for nuclei segmentation. Third, the R2U-Net regression model which is named UD-Net is used for nuclei detection from pathological images. The experiments are conducted with different datasets including Routine Colon Cancer(RCC) classification and detection dataset, and Nuclei Segmentation Challenge 2018 dataset. The experimental results show that the proposed DCNN models provide superior performance compared to the existing approaches for nuclei classification, segmentation, and detection tasks. The results are evaluated with different performance metrics including precision, recall, Dice Coefficient (DC), Means Squared Errors (MSE), F1-score, and overall accuracy. We have achieved around 3.4% and 4.5% better F-1 score for nuclei classification and detection tasks compared to recently published DCNN based method. In addition, R2U-Net shows around 92.15% testing accuracy in term of DC. These improved methods will help for pathological practices for better quantitative analysis of nuclei in Whole Slide Images(WSI) which ultimately will help for better understanding of different types of cancer in clinical workflow.
CVSep 4, 2018
Bangla License Plate Recognition Using Convolutional Neural Networks (CNN)M M Shaifur Rahman, Mst Shamima Nasrin, Moin Mostakim et al.
In the last few years, the deep learning technique in particular Convolutional Neural Networks (CNNs) is using massively in the field of computer vision and machine learning. This deep learning technique provides state-of-the-art accuracy in different classification, segmentation, and detection tasks on different benchmarks such as MNIST, CIFAR-10, CIFAR-100, Microsoft COCO, and ImageNet. However, there are a lot of research has been conducted for Bangla License plate recognition with traditional machine learning approaches in last decade. None of them are used to deploy a physical system for Bangla License Plate Recognition System (BLPRS) due to their poor recognition accuracy. In this paper, we have implemented CNNs based Bangla license plate recognition system with better accuracy that can be applied for different purposes including roadside assistance, automatic parking lot management system, vehicle license status detection and so on. Along with that, we have also created and released a very first and standard database for BLPRS.
CVMar 3, 2018
The History Began from AlexNet: A Comprehensive Survey on Deep Learning ApproachesMd Zahangir Alom, Tarek M. Taha, Christopher Yakopcic et al.
Deep learning has demonstrated tremendous success in variety of application domains in the past few years. This new field of machine learning has been growing rapidly and applied in most of the application domains with some new modalities of applications, which helps to open new opportunity. There are different methods have been proposed on different category of learning approaches, which includes supervised, semi-supervised and un-supervised learning. The experimental results show state-of-the-art performance of deep learning over traditional machine learning approaches in the field of Image Processing, Computer Vision, Speech Recognition, Machine Translation, Art, Medical imaging, Medical information processing, Robotics and control, Bio-informatics, Natural Language Processing (NLP), Cyber security, and many more. This report presents a brief survey on development of DL approaches, including Deep Neural Network (DNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) including Long Short Term Memory (LSTM) and Gated Recurrent Units (GRU), Auto-Encoder (AE), Deep Belief Network (DBN), Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DRL). In addition, we have included recent development of proposed advanced variant DL techniques based on the mentioned DL approaches. Furthermore, DL approaches have explored and evaluated in different application domains are also included in this survey. We have also comprised recently developed frameworks, SDKs, and benchmark datasets that are used for implementing and evaluating deep learning approaches. There are some surveys have published on Deep Learning in Neural Networks [1, 38] and a survey on RL [234]. However, those papers have not discussed the individual advanced techniques for training large scale deep learning models and the recently developed method of generative models [1].
CVFeb 20, 2018
Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image SegmentationMd Zahangir Alom, Mahmudul Hasan, Chris Yakopcic et al.
Deep learning (DL) based semantic segmentation methods have been providing state-of-the-art performance in the last few years. More specifically, these techniques have been successfully applied to medical image classification, segmentation, and detection tasks. One deep learning technique, U-Net, has become one of the most popular for these applications. In this paper, we propose a Recurrent Convolutional Neural Network (RCNN) based on U-Net as well as a Recurrent Residual Convolutional Neural Network (RRCNN) based on U-Net models, which are named RU-Net and R2U-Net respectively. The proposed models utilize the power of U-Net, Residual Network, as well as RCNN. There are several advantages of these proposed architectures for segmentation tasks. First, a residual unit helps when training deep architecture. Second, feature accumulation with recurrent residual convolutional layers ensures better feature representation for segmentation tasks. Third, it allows us to design better U-Net architecture with same number of network parameters with better performance for medical image segmentation. The proposed models are tested on three benchmark datasets such as blood vessel segmentation in retina images, skin cancer segmentation, and lung lesion segmentation. The experimental results show superior performance on segmentation tasks compared to equivalent models including U-Net and residual U-Net (ResU-Net).
CVFeb 7, 2018
Effective Quantization Approaches for Recurrent Neural NetworksMd Zahangir Alom, Adam T Moody, Naoya Maruyama et al.
Deep learning, and in particular Recurrent Neural Networks (RNN) have shown superior accuracy in a large variety of tasks including machine translation, language understanding, and movie frame generation. However, these deep learning approaches are very expensive in terms of computation. In most cases, Graphic Processing Units (GPUs) are in used for large scale implementations. Meanwhile, energy efficient RNN approaches are proposed for deploying solutions on special purpose hardware including Field Programming Gate Arrays (FPGAs) and mobile platforms. In this paper, we propose an effective quantization approach for Recurrent Neural Networks (RNN) techniques including Long Short Term Memory (LSTM), Gated Recurrent Units (GRU), and Convolutional Long Short Term Memory (ConvLSTM). We have implemented different quantization methods including Binary Connect {-1, 1}, Ternary Connect {-1, 0, 1}, and Quaternary Connect {-1, -0.5, 0.5, 1}. These proposed approaches are evaluated on different datasets for sentiment analysis on IMDB and video frame predictions on the moving MNIST dataset. The experimental results are compared against the full precision versions of the LSTM, GRU, and ConvLSTM. They show promising results for both sentiment analysis and video frame prediction.
CVFeb 7, 2018
Deep Versus Wide Convolutional Neural Networks for Object Recognition on Neuromorphic SystemMd Zahangir Alom, Theodore Josue, Md Nayim Rahman et al.
In the last decade, special purpose computing systems, such as Neuromorphic computing, have become very popular in the field of computer vision and machine learning for classification tasks. In 2015, IBM's released the TrueNorth Neuromorphic system, kick-starting a new era of Neuromorphic computing. Alternatively, Deep Learning approaches such as Deep Convolutional Neural Networks (DCNN) show almost human-level accuracies for detection and classification tasks. IBM's 2016 release of a deep learning framework for DCNNs, called Energy Efficient Deep Neuromorphic Networks (Eedn). Eedn shows promise for delivering high accuracies across a number of different benchmarks, while consuming very low power, using IBM's TrueNorth chip. However, there are many things that remained undiscovered using the Eedn framework for classification tasks on a Neuromorphic system. In this paper, we have empirically evaluated the performance of different DCNN architectures implemented within the Eedn framework. The goal of this work was discover the most efficient way to implement DCNN models for object classification tasks using the TrueNorth system. We performed our experiments using benchmark data sets such as MNIST, COIL 20, and COIL 100. The experimental results show very promising classification accuracies with very low power consumption on IBM's NS1e Neurosynaptic system. The results show that for datasets with large numbers of classes, wider networks perform better when compared to deep networks comprised of nearly the same core complexity on IBM's TrueNorth system.
CVDec 28, 2017
Improved Inception-Residual Convolutional Neural Network for Object RecognitionMd Zahangir Alom, Mahmudul Hasan, Chris Yakopcic et al.
Machine learning and computer vision have driven many of the greatest advances in the modeling of Deep Convolutional Neural Networks (DCNNs). Nowadays, most of the research has been focused on improving recognition accuracy with better DCNN models and learning approaches. The recurrent convolutional approach is not applied very much, other than in a few DCNN architectures. On the other hand, Inception-v4 and Residual networks have promptly become popular among computer the vision community. In this paper, we introduce a new DCNN model called the Inception Recurrent Residual Convolutional Neural Network (IRRCNN), which utilizes the power of the Recurrent Convolutional Neural Network (RCNN), the Inception network, and the Residual network. This approach improves the recognition accuracy of the Inception-residual network with same number of network parameters. In addition, this proposed architecture generalizes the Inception network, the RCNN, and the Residual network with significantly improved training accuracy. We have empirically evaluated the performance of the IRRCNN model on different benchmarks including CIFAR-10, CIFAR-100, TinyImageNet-200, and CU3D-100. The experimental results show higher recognition accuracy against most of the popular DCNN models including the RCNN. We have also investigated the performance of the IRRCNN approach against the Equivalent Inception Network (EIN) and the Equivalent Inception Residual Network (EIRN) counterpart on the CIFAR-100 dataset. We report around 4.53%, 4.49% and 3.56% improvement in classification accuracy compared with the RCNN, EIN, and EIRN on the CIFAR-100 dataset respectively. Furthermore, the experiment has been conducted on the TinyImageNet-200 and CU3D-100 datasets where the IRRCNN provides better testing accuracy compared to the Inception Recurrent CNN (IRCNN), the EIN, and the EIRN.
CVDec 28, 2017
Handwritten Bangla Character Recognition Using The State-of-Art Deep Convolutional Neural NetworksMd Zahangir Alom, Peheding Sidike, Mahmudul Hasan et al.
In spite of advances in object recognition technology, Handwritten Bangla Character Recognition (HBCR) remains largely unsolved due to the presence of many ambiguous handwritten characters and excessively cursive Bangla handwritings. Even the best existing recognizers do not lead to satisfactory performance for practical applications related to Bangla character recognition and have much lower performance than those developed for English alpha-numeric characters. To improve the performance of HBCR, we herein present the application of the state-of-the-art Deep Convolutional Neural Networks (DCNN) including VGG Network, All Convolution Network (All-Conv Net), Network in Network (NiN), Residual Network, FractalNet, and DenseNet for HBCR. The deep learning approaches have the advantage of extracting and using feature information, improving the recognition of 2D shapes with a high degree of invariance to translation, scaling and other distortions. We systematically evaluated the performance of DCNN models on publicly available Bangla handwritten character dataset called CMATERdb and achieved the superior recognition accuracy when using DCNN models. This improvement would help in building an automatic HBCR system for practical applications.
CVMay 7, 2017
Handwritten Bangla Digit Recognition Using Deep LearningMd Zahangir Alom, Paheding Sidike, Tarek M. Taha et al.
In spite of the advances in pattern recognition technology, Handwritten Bangla Character Recognition (HBCR) (such as alpha-numeric and special characters) remains largely unsolved due to the presence of many perplexing characters and excessive cursive in Bangla handwriting. Even the best existing recognizers do not lead to satisfactory performance for practical applications. To improve the performance of Handwritten Bangla Digit Recognition (HBDR), we herein present a new approach based on deep neural networks which have recently shown excellent performance in many pattern recognition and machine learning applications, but has not been throughly attempted for HBDR. We introduce Bangla digit recognition techniques based on Deep Belief Network (DBN), Convolutional Neural Networks (CNN), CNN with dropout, CNN with dropout and Gaussian filters, and CNN with dropout and Gabor filters. These networks have the advantage of extracting and using feature information, improving the recognition of two dimensional shapes with a high degree of invariance to translation, scaling and other pattern distortions. We systematically evaluated the performance of our method on publicly available Bangla numeral image database named CMATERdb 3.1.1. From experiments, we achieved 98.78% recognition rate using the proposed method: CNN with Gabor features and dropout, which outperforms the state-of-the-art algorithms for HDBR.
CVApr 25, 2017
Inception Recurrent Convolutional Neural Network for Object RecognitionMd Zahangir Alom, Mahmudul Hasan, Chris Yakopcic et al.
Deep convolutional neural networks (DCNNs) are an influential tool for solving various problems in the machine learning and computer vision fields. In this paper, we introduce a new deep learning model called an Inception- Recurrent Convolutional Neural Network (IRCNN), which utilizes the power of an inception network combined with recurrent layers in DCNN architecture. We have empirically evaluated the recognition performance of the proposed IRCNN model using different benchmark datasets such as MNIST, CIFAR-10, CIFAR- 100, and SVHN. Experimental results show similar or higher recognition accuracy when compared to most of the popular DCNNs including the RCNN. Furthermore, we have investigated IRCNN performance against equivalent Inception Networks and Inception-Residual Networks using the CIFAR-100 dataset. We report about 3.5%, 3.47% and 2.54% improvement in classification accuracy when compared to the RCNN, equivalent Inception Networks, and Inception- Residual Networks on the augmented CIFAR- 100 dataset respectively.
CVApr 21, 2017
Robust Multi-view Pedestrian Tracking Using Neural NetworksMd Zahangir Alom, Tarek M. Taha
In this paper, we present a real-time robust multi-view pedestrian detection and tracking system for video surveillance using neural networks which can be used in dynamic environments. The proposed system consists of two phases: multi-view pedestrian detection and tracking. First, pedestrian detection utilizes background subtraction to segment the foreground blob. An adaptive background subtraction method where each of the pixel of input image models as a mixture of Gaussians and uses an on-line approximation to update the model applies to extract the foreground region. The Gaussian distributions are then evaluated to determine which are most likely to result from a background process. This method produces a steady, real-time tracker in outdoor environment that consistently deals with changes of lighting condition, and long-term scene change. Second, the Tracking is performed at two phases: pedestrian classification and tracking the individual subject. A sliding window is applied on foreground binary image to select an input window which is used for selecting the input image patches from actually input frame. The neural networks is used for classification with PHOG features. Finally, a Kalman filter is applied to calculate the subsequent step for tracking that aims at finding the exact position of pedestrians in an input image. The experimental result shows that the proposed approach yields promising performance on multi-view pedestrian detection and tracking on different benchmark datasets.