IVMar 28, 2023
Exploring Deep Learning Methods for Classification of SAR Images: Towards NextGen Convolutions via TransformersAakash Singh, Vivek Kumar Singh
Images generated by high-resolution SAR have vast areas of application as they can work better in adverse light and weather conditions. One such area of application is in the military systems. This study is an attempt to explore the suitability of current state-of-the-art models introduced in the domain of computer vision for SAR target classification (MSTAR). Since the application of any solution produced for military systems would be strategic and real-time, accuracy is often not the only criterion to measure its performance. Other important parameters like prediction time and input resiliency are equally important. The paper deals with these issues in the context of SAR images. Experimental results show that deep learning models can be suitably applied in the domain of SAR image classification with the desired performance levels.
CVJun 23, 2022
ICOS Protein Expression Segmentation: Can Transformer Networks Give Better Results?Vivek Kumar Singh, Paul O Reilly, Jacqueline James et al.
Biomarkers identify a patients response to treatment. With the recent advances in artificial intelligence based on the Transformer networks, there is only limited research has been done to measure the performance on challenging histopathology images. In this paper, we investigate the efficacy of the numerous state-of-the-art Transformer networks for immune-checkpoint biomarker, Inducible Tcell COStimulator (ICOS) protein cell segmentation in colon cancer from immunohistochemistry (IHC) slides. Extensive and comprehensive experimental results confirm that MiSSFormer achieved the highest Dice score of 74.85% than the rest evaluated Transformer and Efficient U-Net methods.
CLApr 3, 2023
Detection of Homophobia & Transphobia in Dravidian Languages: Exploring Deep Learning MethodsDeepawali Sharma, Vedika Gupta, Vivek Kumar Singh
The increase in abusive content on online social media platforms is impacting the social life of online users. Use of offensive and hate speech has been making so-cial media toxic. Homophobia and transphobia constitute offensive comments against LGBT+ community. It becomes imperative to detect and handle these comments, to timely flag or issue a warning to users indulging in such behaviour. However, automated detection of such content is a challenging task, more so in Dravidian languages which are identified as low resource languages. Motivated by this, the paper attempts to explore applicability of different deep learning mod-els for classification of the social media comments in Malayalam and Tamil lan-guages as homophobic, transphobic and non-anti-LGBT+content. The popularly used deep learning models- Convolutional Neural Network (CNN), Long Short Term Memory (LSTM) using GloVe embedding and transformer-based learning models (Multilingual BERT and IndicBERT) are applied to the classification problem. Results obtained show that IndicBERT outperforms the other imple-mented models, with obtained weighted average F1-score of 0.86 and 0.77 for Malayalam and Tamil, respectively. Therefore, the present work confirms higher performance of IndicBERT on the given task in selected Dravidian languages.
CYMar 21, 2025
Data to Decisions: A Computational Framework to Identify skill requirements from Advertorial DataAakash Singh, Anurag Kanaujia, Vivek Kumar Singh
Among the factors of production, human capital or skilled manpower is the one that keeps evolving and adapts to changing conditions and resources. This adaptability makes human capital the most crucial factor in ensuring a sustainable growth of industry/sector. As new technologies are developed and adopted, the new generations are required to acquire skills in newer technologies in order to be employable. At the same time professionals are required to upskill and reskill themselves to remain relevant in the industry. There is however no straightforward method to identify the skill needs of the industry at a given point of time. Therefore, this paper proposes a data to decision framework that can successfully identify the desired skill set in a given area by analysing the advertorial data collected from popular online job portals and supplied as input to the framework. The proposed framework uses techniques of statistical analysis, data mining and natural language processing for the purpose. The applicability of the framework is demonstrated on CS&IT job advertisement data from India. The analytical results not only provide useful insights about current state of skill needs in CS&IT industry but also provide practical implications to prospective job applicants, training agencies, and institutions of higher education & professional training.
GTFeb 25, 2025
Colored Jones Polynomials and the Volume ConjectureMark Hughes, Vishnu Jejjala, P. Ramadevi et al.
Using the vertex model approach for braid representations, we compute polynomials for spin-1 placed on hyperbolic knots up to 15 crossings. These polynomials are referred to as 3-colored Jones polynomials or adjoint Jones polynomials. Training a subset of the data using a fully connected feedforward neural network, we predict the volume of the knot complement of hyperbolic knots from the adjoint Jones polynomial or its evaluations with 99.34% accuracy. A function of the adjoint Jones polynomial evaluated at the phase $q=e^{ 8 πi / 15 }$ predicts the volume with nearly the same accuracy as the neural network. From an analysis of 2-colored and 3-colored Jones polynomials, we conjecture the best phase for $n$-colored Jones polynomials, and use this hypothesis to motivate an improved statement of the volume conjecture. This is tested for knots for which closed form expressions for the $n$-colored Jones polynomial are known, and we show improved convergence to the volume.
CVMar 19, 2024
A Hybrid Transformer-Sequencer approach for Age and Gender classification from in-wild facial imagesAakash Singh, Vivek Kumar Singh
The advancements in computer vision and image processing techniques have led to emergence of new application in the domain of visual surveillance, targeted advertisement, content-based searching, and human-computer interaction etc. Out of the various techniques in computer vision, face analysis, in particular, has gained much attention. Several previous studies have tried to explore different applications of facial feature processing for a variety of tasks, including age and gender classification. However, despite several previous studies having explored the problem, the age and gender classification of in-wild human faces is still far from the achieving the desired levels of accuracy required for real-world applications. This paper, therefore, attempts to bridge this gap by proposing a hybrid model that combines self-attention and BiLSTM approaches for age and gender classification problems. The proposed models performance is compared with several state-of-the-art model proposed so far. An improvement of approximately 10percent and 6percent over the state-of-the-art implementations for age and gender classification, respectively, are noted for the proposed model. The proposed model is thus found to achieve superior performance and is found to provide a more generalized learning. The model can, therefore, be applied as a core classification component in various image processing and computer vision problems.
IVMay 5, 2023
Breast Cancer Immunohistochemical Image Generation: a Benchmark Dataset and Challenge ReviewChuang Zhu, Shengjie Liu, Zekuan Yu et al.
For invasive breast cancer, immunohistochemical (IHC) techniques are often used to detect the expression level of human epidermal growth factor receptor-2 (HER2) in breast tissue to formulate a precise treatment plan. From the perspective of saving manpower, material and time costs, directly generating IHC-stained images from Hematoxylin and Eosin (H&E) stained images is a valuable research direction. Therefore, we held the breast cancer immunohistochemical image generation challenge, aiming to explore novel ideas of deep learning technology in pathological image generation and promote research in this field. The challenge provided registered H&E and IHC-stained image pairs, and participants were required to use these images to train a model that can directly generate IHC-stained images from corresponding H&E-stained images. We selected and reviewed the five highest-ranking methods based on their PSNR and SSIM metrics, while also providing overviews of the corresponding pipelines and implementations. In this paper, we further analyze the current limitations in the field of breast cancer immunohistochemical image generation and forecast the future development of this field. We hope that the released dataset and the challenge will inspire more scholars to jointly study higher-quality IHC-stained image generation.
CVJan 23, 2021
Network-Agnostic Knowledge Transfer for Medical Image SegmentationShuhang Wang, Vivek Kumar Singh, Alex Benjamin et al.
Conventional transfer learning leverages weights of pre-trained networks, but mandates the need for similar neural architectures. Alternatively, knowledge distillation can transfer knowledge between heterogeneous networks but often requires access to the original training data or additional generative networks. Knowledge transfer between networks can be improved by being agnostic to the choice of network architecture and reducing the dependence on original training data. We propose a knowledge transfer approach from a teacher to a student network wherein we train the student on an independent transferal dataset, whose annotations are generated by the teacher. Experiments were conducted on five state-of-the-art networks for semantic segmentation and seven datasets across three imaging modalities. We studied knowledge transfer from a single teacher, combination of knowledge transfer and fine-tuning, and knowledge transfer from multiple teachers. The student model with a single teacher achieved similar performance as the teacher; and the student model with multiple teachers achieved better performance than the teachers. The salient features of our algorithm include: 1)no need for original training data or generative networks, 2) knowledge transfer between different architectures, 3) ease of implementation for downstream tasks by using the downstream task dataset as the transferal dataset, 4) knowledge transfer of an ensemble of models, trained independently, into one student model. Extensive experiments demonstrate that the proposed algorithm is effective for knowledge transfer and easily tunable.
DLJul 12, 2020
India's rank and global share in scientific research -- how publication counting method and subject selection can vary the outcomesVivek Kumar Singh, Parveen Arora, Ashraf Uddin et al.
During the last two decades, India has emerged as a major knowledge producer in the world, however different reports put it at different ranks, varying from 3rd to 9th places. The recent commissioned study reports of Department of Science and Technology (DST) done by Elsevier and Clarivate Analytics, rank India at 5thand 9th places, respectively. On the other hand, an independent report by National Science Foundation (NSF) of United States (US), ranks India at 3rd place on research output in Science and Engineering area. Interestingly, both, the Elsevier and the NSF reports use Scopus data, and yet surprisingly their outcomes are different. This article, therefore, attempts to investigate as to how the use of same database can still produce different outcomes, due to differences in methodological approaches. The publication counting method used and the subject selection approach are the two main exogenous factors identified to cause these variations. The implications of the analytical outcomes are discussed with special focus on policy perspectives.
DLJul 12, 2020
India's rank and global share in scientific research -- how data sourced from different databases can produce varying outcomesPrashasti Singh, Vivek Kumar Singh, Parveen Arora et al.
India is emerging as a major knowledge producer of the world in terms of proportionate share of global research output and the overall research productivity rank. Many recent reports, both of commissioned studies from Government of India as well as independent international agencies, show India at different ranks of global research productivity (variations as large as from 3rd to 9th place). The paper examines this contradiction; tries to analyse as to why different reports places India at different ranks and what may be the reasons thereof. The research output data for India, along with the ten most productive countries in the world, is analysed from three major scholarly databases: Web of Science, Scopus and Dimensions for this purpose. Results show that both, the endogenous factors (such as database coverage variation and different subject classification schemes) and the exogenous factors (such as subject selection and publication counting methodology) cause the variations in different reports. This paper reports first part of the analysis, focusing mainly on variations due to use of data from different databases. The policy implications of the study are also discussed.
IVJul 5, 2019
Adversarial Learning with Multiscale Features and Kernel Factorization for Retinal Blood Vessel SegmentationFarhan Akram, Vivek Kumar Singh, Hatem A. Rashwan et al.
In this paper, we propose an efficient blood vessel segmentation method for the eye fundus images using adversarial learning with multiscale features and kernel factorization. In the generator network of the adversarial framework, spatial pyramid pooling, kernel factorization and squeeze excitation block are employed to enhance the feature representation in spatial domain on different scales with reduced computational complexity. In turn, the discriminator network of the adversarial framework is formulated by combining convolutional layers with an additional squeeze excitation block to differentiate the generated segmentation mask from its respective ground truth. Before feeding the images to the network, we pre-processed them by using edge sharpening and Gaussian regularization to reach an optimized solution for vessel segmentation. The output of the trained model is post-processed using morphological operations to remove the small speckles of noise. The proposed method qualitatively and quantitatively outperforms state-of-the-art vessel segmentation methods using DRIVE and STARE datasets.
IVJul 1, 2019
An Efficient Solution for Breast Tumor Segmentation and Classification in Ultrasound Images Using Deep Adversarial LearningVivek Kumar Singh, Hatem A. Rashwan, Mohamed Abdel-Nasser et al.
This paper proposes an efficient solution for tumor segmentation and classification in breast ultrasound (BUS) images. We propose to add an atrous convolution layer to the conditional generative adversarial network (cGAN) segmentation model to learn tumor features at different resolutions of BUS images. To automatically re-balance the relative impact of each of the highest level encoded features, we also propose to add a channel-wise weighting block in the network. In addition, the SSIM and L1-norm loss with the typical adversarial loss are used as a loss function to train the model. Our model outperforms the state-of-the-art segmentation models in terms of the Dice and IoU metrics, achieving top scores of 93.76% and 88.82%, respectively. In the classification stage, we show that few statistics features extracted from the shape of the boundaries of the predicted masks can properly discriminate between benign and malignant tumors with an accuracy of 85%$
IVJul 1, 2019
SLSNet: Skin lesion segmentation using a lightweight generative adversarial networkMd. Mostafa Kamal Sarker, Hatem A. Rashwan, Farhan Akram et al.
The determination of precise skin lesion boundaries in dermoscopic images using automated methods faces many challenges, most importantly, the presence of hair, inconspicuous lesion edges and low contrast in dermoscopic images, and variability in the color, texture and shapes of skin lesions. Existing deep learning-based skin lesion segmentation algorithms are expensive in terms of computational time and memory. Consequently, running such segmentation algorithms requires a powerful GPU and high bandwidth memory, which are not available in dermoscopy devices. Thus, this article aims to achieve precise skin lesion segmentation with minimum resources: a lightweight, efficient generative adversarial network (GAN) model called SLSNet, which combines 1-D kernel factorized networks, position and channel attention, and multiscale aggregation mechanisms with a GAN model. The 1-D kernel factorized network reduces the computational cost of 2D filtering. The position and channel attention modules enhance the discriminative ability between the lesion and non-lesion feature representations in spatial and channel dimensions, respectively. A multiscale block is also used to aggregate the coarse-to-fine features of input skin images and reduce the effect of the artifacts. SLSNet is evaluated on two publicly available datasets: ISBI 2017 and the ISIC 2018. Although SLSNet has only 2.35 million parameters, the experimental results demonstrate that it achieves segmentation results on a par with the state-of-the-art skin lesion segmentation methods with an accuracy of 97.61%, and Dice and Jaccard similarity coefficients of 90.63% and 81.98%, respectively. SLSNet can run at more than 110 frames per second (FPS) in a single GTX1080Ti GPU, which is faster than well-known deep learning-based image segmentation models, such as FCN. Therefore, SLSNet can be used for practical dermoscopic applications.
CVSep 5, 2018
Breast Tumor Segmentation and Shape Classification in Mammograms using Generative Adversarial and Convolutional Neural NetworkVivek Kumar Singh, Hatem A. Rashwan, Santiago Romani et al.
Mammogram inspection in search of breast tumors is a tough assignment that radiologists must carry out frequently. Therefore, image analysis methods are needed for the detection and delineation of breast masses, which portray crucial morphological information that will support reliable diagnosis. In this paper, we proposed a conditional Generative Adversarial Network (cGAN) devised to segment a breast mass within a region of interest (ROI) in a mammogram. The generative network learns to recognize the breast mass area and to create the binary mask that outlines the breast mass. In turn, the adversarial network learns to distinguish between real (ground truth) and synthetic segmentations, thus enforcing the generative network to create binary masks as realistic as possible. The cGAN works well even when the number of training samples are limited. Therefore, the proposed method outperforms several state-of-the-art approaches. This hypothesis is corroborated by diverse experiments performed on two datasets, the public INbreast and a private in-house dataset. The proposed segmentation model provides a high Dice coefficient and Intersection over Union (IoU) of 94% and 87%, respectively. In addition, a shape descriptor based on a Convolutional Neural Network (CNN) is proposed to classify the generated masks into four mass shapes: irregular, lobular, oval and round. The proposed shape descriptor was trained on Digital Database for Screening Mammography (DDSM) yielding an overall accuracy of 80%, which outperforms the current state-of-the-art.
CVJul 30, 2018
REFUGE CHALLENGE 2018-Task 2:Deep Optic Disc and Cup Segmentation in Fundus Images Using U-Net and Multi-scale Feature Matching NetworksVivek Kumar Singh, Hatem A. Rashwan, Adel Saleh et al.
In this paper, an optic disc and cup segmentation method is proposed using U-Net followed by a multi-scale feature matching network. The proposed method targets task 2 of the REFUGE challenge 2018. In order to solve the segmentation problem of task 2, we firstly crop the input image using single shot multibox detector (SSD). The cropped image is then passed to an encoder-decoder network with skip connections also known as generator. Afterwards, both the ground truth and generated images are fed to a convolution neural network (CNN) to extract their multi-level features. A dice loss function is then used to match the features of the two images by minimizing the error at each layer. The aggregation of error from each layer is back-propagated through the generator network to enforce it to generate a segmented image closer to the ground truth. The CNN network improves the performance of the generator network without increasing the complexity of the model.
CVJun 11, 2018
Retinal Optic Disc Segmentation using Conditional Generative Adversarial NetworkVivek Kumar Singh, Hatem Rashwan, Farhan Akram et al.
This paper proposed a retinal image segmentation method based on conditional Generative Adversarial Network (cGAN) to segment optic disc. The proposed model consists of two successive networks: generator and discriminator. The generator learns to map information from the observing input (i.e., retinal fundus color image), to the output (i.e., binary mask). Then, the discriminator learns as a loss function to train this mapping by comparing the ground-truth and the predicted output with observing the input image as a condition.Experiments were performed on two publicly available dataset; DRISHTI GS1 and RIM-ONE. The proposed model outperformed state-of-the-art-methods by achieving around 0.96% and 0.98% of Jaccard and Dice coefficients, respectively. Moreover, an image segmentation is performed in less than a second on recent GPU.
CVMay 25, 2018
SLSDeep: Skin Lesion Segmentation Based on Dilated Residual and Pyramid Pooling NetworksMd. Mostafa Kamal Sarker, Hatem A. Rashwan, Farhan Akram et al.
Skin lesion segmentation (SLS) in dermoscopic images is a crucial task for automated diagnosis of melanoma. In this paper, we present a robust deep learning SLS model, so-called SLSDeep, which is represented as an encoder-decoder network. The encoder network is constructed by dilated residual layers, in turn, a pyramid pooling network followed by three convolution layers is used for the decoder. Unlike the traditional methods employing a cross-entropy loss, we investigated a loss function by combining both Negative Log Likelihood (NLL) and End Point Error (EPE) to accurately segment the melanoma regions with sharp boundaries. The robustness of the proposed model was evaluated on two public databases: ISBI 2016 and 2017 for skin lesion analysis towards melanoma detection challenge. The proposed model outperforms the state-of-the-art methods in terms of segmentation accuracy. Moreover, it is capable to segment more than $100$ images of size 384x384 per second on a recent GPU.
CVMay 25, 2018
Conditional Generative Adversarial and Convolutional Networks for X-ray Breast Mass Segmentation and Shape ClassificationVivek Kumar Singh, Santiago Romani, Hatem A. Rashwan et al.
This paper proposes a novel approach based on conditional Generative Adversarial Networks (cGAN) for breast mass segmentation in mammography. We hypothesized that the cGAN structure is well-suited to accurately outline the mass area, especially when the training data is limited. The generative network learns intrinsic features of tumors while the adversarial network enforces segmentations to be similar to the ground truth. Experiments performed on dozens of malignant tumors extracted from the public DDSM dataset and from our in-house private dataset confirm our hypothesis with very high Dice coefficient and Jaccard index (>94% and >89%, respectively) outperforming the scores obtained by other state-of-the-art approaches. Furthermore, in order to detect portray significant morphological features of the segmented tumor, a specific Convolutional Neural Network (CNN) have also been designed for classifying the segmented tumor areas into four types (irregular, lobular, oval and round), which provides an overall accuracy about 72% with the DDSM dataset.