Kumar Abhishek

CV
h-index: 15
29 papers
1,408 citations
Novelty: 39%
AI Score: 53

29 Papers

IV Jun 1, 2022 Code
A Survey on Deep Learning for Skin Lesion Segmentation

Zahra Mirikharaji, Kumar Abhishek, Alceu Bissoto et al.

Skin cancer is a major public health problem that could benefit from computer-aided diagnosis to reduce the burden of this common disease. Skin lesion segmentation from images is an important step toward achieving this goal. However, the presence of natural and artificial artifacts (e.g., hair and air bubbles), intrinsic factors (e.g., lesion shape and contrast), and variations in image acquisition conditions make skin lesion segmentation a challenging task. Recently, various researchers have explored the applicability of deep learning models to skin lesion segmentation. In this survey, we cross-examine 177 research papers that deal with deep learning-based segmentation of skin lesions. We analyze these works along several dimensions, including input data (datasets, preprocessing, and synthetic data generation), model design (architecture, modules, and losses), and evaluation aspects (data annotation requirements and segmentation performance). We discuss these dimensions both from the viewpoint of select seminal works, and from a systematic viewpoint, examining how those choices have influenced current trends, and how their limitations should be addressed. To facilitate comparisons, we summarize all examined works in a comprehensive table as well as an interactive table available online at https://github.com/sfu-mial/skin-lesion-segmentation-survey.

22.5 CV Jun 1
Quality-Guided Semi-Supervised Learning for Medical Image Segmentation

Kumar Abhishek, Ghassan Hamarneh

Training accurate medical image segmentation models requires large amounts of densely annotated data, which is costly and time-consuming to obtain. Semi-supervised learning (SSL) alleviates this by learning from both abundant unlabeled data and limited labeled data. However, most modern SSL methods rely on pseudolabels for unlabeled data, and typically assess their reliability through model confidence or uncertainty, measures that are self-referential and lack explicit grounding in segmentation quality. Instead, we propose a quality-guided SSL framework that trains a dedicated network to estimate segmentation quality from image-mask pairs. The predictor is trained on variable-quality masks generated through synthetic corruptions augmented with imperfect outputs from partially trained segmentation models, capturing realistic error patterns encountered during training. We integrate the quality predictor into SSL through two complementary mechanisms: a quality-aware regularization loss and a quality-based pseudolabel sample reweighting scheme. We show that our method serves as a drop-in enhancement to existing SSL frameworks. Extensive experiments across five datasets and multiple architectures demonstrate consistent improvements over competing SSL methods, advancing the state-of-the-art in semi-supervised medical image segmentation.

CV Aug 29, 2022
CIRCLe: Color Invariant Representation Learning for Unbiased Classification of Skin Lesions

Arezou Pakzad, Kumar Abhishek, Ghassan Hamarneh

While deep learning based approaches have demonstrated expert-level performance in dermatological diagnosis tasks, they have also been shown to exhibit biases toward certain demographic attributes, particularly skin types (e.g., light versus dark), a fairness concern that must be addressed. We propose CIRCLe, a skin color invariant deep representation learning method for improving fairness in skin lesion classification. CIRCLe is trained to classify images by utilizing a regularization loss that encourages images with the same diagnosis but different skin types to have similar latent representations. Through extensive evaluation and ablation studies, we demonstrate CIRCLe's superior performance over the state-of-the-art when evaluated on 16k+ images spanning 6 Fitzpatrick skin types and 114 diseases, using classification accuracy, equal opportunity difference (for light versus dark groups), and normalized accuracy range, a new measure we propose to assess fairness on multiple skin type groups.

CV Aug 5, 2024 Code
Segmentation Style Discovery: Application to Skin Lesion Images

Kumar Abhishek, Jeremy Kawahara, Ghassan Hamarneh

Variability in medical image segmentation, arising from annotator preferences, expertise, and their choice of tools, has been well documented. While the majority of multi-annotator segmentation approaches focus on modeling annotator-specific preferences, they require annotator-segmentation correspondence. In this work, we introduce the problem of segmentation style discovery, and propose StyleSeg, a segmentation method that learns plausible, diverse, and semantically consistent segmentation styles from a corpus of image-mask pairs without any knowledge of annotator correspondence. StyleSeg consistently outperforms competing methods on four publicly available skin lesion segmentation (SLS) datasets. We also curate ISIC-MultiAnnot, the largest multi-annotator SLS dataset with annotator correspondence, and our results show a strong alignment, using our newly proposed measure AS2, between the predicted styles and annotator preferences. The code and the dataset are available at https://github.com/sfu-mial/StyleSeg.

CV Aug 5, 2024 Code
Lesion Elevation Prediction from Skin Images Improves Diagnosis

Kumar Abhishek, Ghassan Hamarneh

While deep learning-based computer-aided diagnosis for skin lesion image analysis is approaching dermatologists' performance levels, there are several works showing that incorporating additional features such as shape priors, texture, color constancy, and illumination further improves the lesion diagnosis performance. In this work, we look at another clinically useful feature, skin lesion elevation, and investigate the feasibility of predicting and leveraging skin lesion elevation labels. Specifically, we use a deep learning model to predict image-level lesion elevation labels from 2D skin lesion images. We test the elevation prediction accuracy on the derm7pt dataset, and use the elevation prediction model to estimate elevation labels for images from five other datasets: ISIC 2016, 2017, and 2018 Challenge datasets, MSK, and DermoFit. We evaluate cross-domain generalization by using these estimated elevation labels as auxiliary inputs to diagnosis models, and show that these improve the classification performance, with AUROC improvements of up to 6.29% and 2.69% for dermoscopic and clinical images, respectively. The code is publicly available at https://github.com/sfu-mial/LesionElevation.

LG Apr 7, 2022
Multi-Sample $ζ$-mixup: Richer, More Realistic Synthetic Samples from a $p$-Series Interpolant

Kumar Abhishek, Colin J. Brown, Ghassan Hamarneh

Modern deep learning training procedures rely on model regularization techniques such as data augmentation methods, which generate training samples that increase the diversity of data and richness of label information. A popular recent method, mixup, uses convex combinations of pairs of original samples to generate new samples. However, as we show in our experiments, mixup can produce undesirable synthetic samples, where the data is sampled off the manifold and can contain incorrect labels. We propose $ζ$-mixup, a generalization of mixup with provably and demonstrably desirable properties that allows convex combinations of $N \geq 2$ samples, leading to more realistic and diverse outputs that incorporate information from $N$ original samples by using a $p$-series interpolant. We show that, compared to mixup, $ζ$-mixup better preserves the intrinsic dimensionality of the original datasets, which is a desirable property for training generalizable models. Furthermore, we show that our implementation of $ζ$-mixup is faster than mixup, and extensive evaluation on controlled synthetic and 24 real-world natural and medical image classification datasets shows that $ζ$-mixup outperforms mixup and traditional data augmentation techniques.
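The idea of mixing $N$ samples with $p$-series weights can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are taken to be proportional to $i^{-\gamma}$ and normalized to sum to 1, and each synthetic sample assigns those weights to the originals in a fresh random order. The function names, the `gamma` default, and the use of plain Python lists are all hypothetical choices made for this sketch.

```python
import random


def p_series_weights(n, gamma):
    """Normalized weights w_i proportional to i^(-gamma), for i = 1..n."""
    raw = [i ** (-gamma) for i in range(1, n + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def zeta_mixup_sketch(samples, labels, gamma=2.0, seed=None):
    """Build len(samples) synthetic samples as convex combinations of all inputs.

    samples: list of equal-length feature lists.
    labels:  list of equal-length one-hot label lists.
    gamma:   assumed p-series exponent (illustrative default).
    """
    rng = random.Random(seed)
    n = len(samples)
    weights = p_series_weights(n, gamma)
    n_features = len(samples[0])
    n_classes = len(labels[0])

    mixed_x, mixed_y = [], []
    for _ in range(n):
        # Random ordering: original sample order[i] receives weights[i].
        order = list(range(n))
        rng.shuffle(order)
        x = [sum(w * samples[j][d] for w, j in zip(weights, order))
             for d in range(n_features)]
        y = [sum(w * labels[j][c] for w, j in zip(weights, order))
             for c in range(n_classes)]
        mixed_x.append(x)
        mixed_y.append(y)
    return mixed_x, mixed_y
```

Because the weights decay quickly with rank, each synthetic sample stays close to the one original that receives the largest weight, while still carrying some information from the other samples.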

CV Nov 27, 2022
Attribution-based XAI Methods in Computer Vision: A Review

Kumar Abhishek, Deeksha Kamath

The advancements in deep learning-based methods for visual perception tasks have seen astounding growth in the last decade, with widespread adoption in a plethora of application areas from autonomous driving to clinical decision support systems. Despite their impressive performance, these deep learning-based models remain fairly opaque in their decision-making process, making their deployment in human-critical tasks a risky endeavor. This in turn makes understanding the decisions made by these models crucial for their reliable deployment. Explainable AI (XAI) methods attempt to address this by offering explanations for such black-box deep learning methods. In this paper, we provide a comprehensive survey of attribution-based XAI methods in computer vision and review the existing literature for gradient-based, perturbation-based, and contrastive methods for XAI, and provide insights on the key challenges in developing and evaluating robust XAI methods.

AI Apr 3, 2022
Semantic Sensor Network Ontology based Decision Support System for Forest Fire Management

Ritesh Chandra, Kumar Abhishek, Sonali Agarwal et al.

Forests are significant assets for every country. Their destruction can negatively impact the environment, and forest fires are one of the primary causes. Fire weather indices are widely used to measure fire danger and to issue bushfire warnings. They can also be used to predict the demand for emergency management resources. Sensor networks have grown in popularity for data collection and processing in a variety of application areas such as medicine, environmental monitoring, and home automation. Semantic sensor networks can collect various climatic conditions such as wind speed, temperature, and relative humidity. However, estimating fire weather indices is challenging due to the various issues involved in processing the data streams generated by the sensors. Hence, the importance of forest fire detection has increased day by day. The underlying Semantic Sensor Network (SSN) ontologies are built to allow developers to create rules for calculating fire weather indices and to convert the dataset into Resource Description Framework (RDF). This research describes the various steps involved in developing rules for calculating fire weather indices. In addition, this work presents a Web-based mapping interface to help users visualize the changes in fire weather indices over time. With the help of the inference rules, a decision support system is designed using the SSN ontology and queried through SPARQL. The proposed fire management system acts according to the situation and supports reasoning and the general open-world semantics followed by all the ontologies.

CV Dec 25, 2025
IMA++: ISIC Archive Multi-Annotator Dermoscopic Skin Lesion Segmentation Dataset

Kumar Abhishek, Jeremy Kawahara, Ghassan Hamarneh

Multi-annotator medical image segmentation is an important research problem, but requires annotated datasets that are expensive to collect. Dermoscopic skin lesion imaging allows human experts and AI systems to observe morphological structures otherwise not discernable from regular clinical photographs. However, currently there are no large-scale publicly available multi-annotator skin lesion segmentation (SLS) datasets with annotator-labels for dermoscopic skin lesion imaging. We introduce ISIC MultiAnnot++, a large public multi-annotator skin lesion segmentation dataset for images from the ISIC Archive. The final dataset contains 17,684 segmentation masks spanning 14,967 dermoscopic images, where 2,394 dermoscopic images have 2-5 segmentations per image, making it the largest publicly available SLS dataset. Further, metadata about the segmentation, including the annotators' skill level and segmentation tool, is included, enabling research on topics such as annotator-specific preference modeling for segmentation and annotator metadata analysis. We provide an analysis on the characteristics of this dataset, curated data partitions, and consensus segmentation masks.

CV Aug 12, 2025 Code
What Can We Learn from Inter-Annotator Variability in Skin Lesion Segmentation?

Kumar Abhishek, Jeremy Kawahara, Ghassan Hamarneh

Medical image segmentation exhibits intra- and inter-annotator variability due to ambiguous object boundaries, annotator preferences, expertise, and tools, among other factors. Lesions with ambiguous boundaries, e.g., spiculated or infiltrative nodules, or irregular borders per the ABCD rule, are particularly prone to disagreement and are often associated with malignancy. In this work, we curate IMA++, the largest multi-annotator skin lesion segmentation dataset, on which we conduct an in-depth study of variability due to annotator, malignancy, tool, and skill factors. We find a statistically significant (p<0.001) association between inter-annotator agreement (IAA), measured using Dice, and the malignancy of skin lesions. We further show that IAA can be accurately predicted directly from dermoscopic images, achieving a mean absolute error of 0.108. Finally, we leverage this association by utilizing IAA as a "soft" clinical feature within a multi-task learning objective, yielding a 4.2% improvement in balanced accuracy averaged across multiple model architectures and across IMA++ and four public dermoscopic datasets. The code is available at https://github.com/sfu-mial/skin-IAV.
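One common way to turn Dice into a per-image inter-annotator agreement score is to average it over all pairs of annotator masks. The sketch below assumes that mean-pairwise definition, which the abstract does not spell out, and represents each mask as a flat list of 0/1 pixels. The function names and the empty-mask convention are hypothetical.

```python
from itertools import combinations


def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    intersection = sum(1 for p, q in zip(mask_a, mask_b) if p and q)
    total = sum(1 for p in mask_a if p) + sum(1 for q in mask_b if q)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_pairwise_dice(masks):
    """Average Dice over all unordered pairs of annotator masks for one image."""
    pairs = list(combinations(masks, 2))
    if not pairs:
        raise ValueError("need at least two masks to measure agreement")
    return sum(dice(a, b) for a, b in pairs) / len(pairs)
```

For an image with 2 to 5 segmentations this gives a single number in [0, 1], which could then serve as a regression target or an auxiliary feature.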

IV May 22, 2023 Code
DermSynth3D: Synthesis of in-the-wild Annotated Dermatology Images

Ashish Sinha, Jeremy Kawahara, Arezou Pakzad et al.

In recent years, deep learning (DL) has shown great potential in the field of dermatological image analysis. However, existing datasets in this domain have significant limitations, including a small number of image samples, limited disease conditions, insufficient annotations, and non-standardized image acquisitions. To address these shortcomings, we propose a novel framework called DermSynth3D. DermSynth3D blends skin disease patterns onto 3D textured meshes of human subjects using a differentiable renderer and generates 2D images from various camera viewpoints under chosen lighting conditions in diverse background scenes. Our method adheres to top-down rules that constrain the blending and rendering process to create 2D images with skin conditions that mimic in-the-wild acquisitions, ensuring more meaningful results. The framework generates photo-realistic 2D dermoscopy images and the corresponding dense annotations for semantic segmentation of the skin, skin conditions, body parts, bounding boxes around lesions, depth maps, and other 3D scene parameters, such as camera position and lighting conditions. DermSynth3D allows for the creation of custom datasets for various dermatology tasks. We demonstrate the effectiveness of data generated using DermSynth3D by training DL models on synthetic data and evaluating them on various dermatology tasks using real 2D dermatological images. We make our code publicly available at https://github.com/sfu-mial/DermSynth3D.

IV Oct 26, 2020 Code
Matthews Correlation Coefficient Loss for Deep Convolutional Networks: Application to Skin Lesion Segmentation

Kumar Abhishek, Ghassan Hamarneh

The segmentation of skin lesions is a crucial task in clinical decision support systems for the computer aided diagnosis of skin lesions. Although deep learning-based approaches have improved segmentation performance, these models are often susceptible to class imbalance in the data, particularly, the fraction of the image occupied by the background healthy skin. Despite variations of the popular Dice loss function being proposed to tackle the class imbalance problem, the Dice loss formulation does not penalize misclassifications of the background pixels. We propose a novel metric-based loss function using the Matthews correlation coefficient, a metric that has been shown to be efficient in scenarios with skewed class distributions, and use it to optimize deep segmentation models. Evaluations on three skin lesion image datasets: the ISBI ISIC 2017 Skin Lesion Segmentation Challenge dataset, the DermoFit Image Library, and the PH2 dataset, show that models trained using the proposed loss function outperform those trained using Dice loss by 11.25%, 4.87%, and 0.76% respectively in the mean Jaccard index. The code is available at https://github.com/kakumarabhishek/MCC-Loss.
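A loss based on the Matthews correlation coefficient can be sketched with "soft" confusion-matrix entries computed from predicted foreground probabilities. This is an illustrative version under assumptions, not the code from the linked repository: the loss is taken as 1 minus MCC, inputs are flat lists, and the small `eps` in the denominator is an added safeguard against division by zero. Unlike Dice, the numerator includes the true-negative term, so background errors affect the value.

```python
import math


def soft_mcc_loss(probs, targets, eps=1e-8):
    """Return 1 - MCC using soft confusion-matrix counts.

    probs:   predicted foreground probabilities in [0, 1] (flat sequence).
    targets: ground-truth labels, 0 or 1 (flat sequence).
    eps:     assumed stabilizer added to the denominator.
    """
    tp = sum(p * y for p, y in zip(probs, targets))
    tn = sum((1 - p) * (1 - y) for p, y in zip(probs, targets))
    fp = sum(p * (1 - y) for p, y in zip(probs, targets))
    fn = sum((1 - p) * y for p, y in zip(probs, targets))

    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = numerator / (denominator + eps)
    return 1.0 - mcc
```

The loss is close to 0 for a perfect prediction and close to 2 for a fully inverted one. In a deep learning framework the same expression would be written with tensor operations so that gradients flow through `probs`.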

CY Aug 12, 2025
Ethical Medical Image Synthesis

Weina Jin, Ashish Sinha, Kumar Abhishek et al.

The task of ethical Medical Image Synthesis (MISyn) is to ensure that the MISyn techniques are researched and developed ethically throughout their entire lifecycle, which is essential to prevent the negative impacts of MISyn. To address the ever-increasing needs and requirements for ethical practice of MISyn research and development, we first conduct a theoretical analysis that identifies the key properties of ethical MISyn and intrinsic limits of MISyn. We identify that synthetic images lack inherent grounding in real medical phenomena, cannot fully represent the training medical images, and inevitably introduce new distribution shifts and biases. Ethical risks can arise from not acknowledging the intrinsic limits and weaknesses of synthetic images compared to medical images, with the extreme form manifested as misinformation of MISyn that substitutes synthetic images for medical images without acknowledgment. The resulting ethical harms include eroding trust in the medical imaging dataset environment and causing algorithmic discrimination towards stakeholders and the public. To facilitate collective efforts towards ethical MISyn within and outside the medical image analysis community, we then propose practical supports for ethical practice in MISyn based on the theoretical analysis, including ethical practice recommendations that adapt the existing technical standards, problem formulation, design, and evaluation practice of MISyn to the ethical challenges; and oversight recommendations to facilitate checks and balances from stakeholders and the public. We also present two case studies that demonstrate how to apply the ethical practice recommendations in practice, and identify gaps between existing practice and the ethical practice recommendations.

IV Nov 4, 2024
Disentangled PET Lesion Segmentation

Tanya Gatsak, Kumar Abhishek, Hanene Ben Yedder et al.

PET imaging is an invaluable tool in clinical settings as it captures the functional activity of both healthy anatomy and cancerous lesions. Developing automatic lesion segmentation methods for PET images is crucial since manual lesion segmentation is laborious and prone to inter- and intra-observer variability. We propose PET-Disentangler, a 3D disentanglement method that uses a 3D UNet-like encoder-decoder architecture to disentangle disease and normal healthy anatomical features with losses for segmentation, reconstruction, and healthy component plausibility. A critic network is used to encourage the healthy latent features to match the distribution of healthy samples, and thus encourages these features not to contain any lesion-related features. Our quantitative results show that PET-Disentangler is less prone to incorrectly declaring healthy and high tracer uptake regions as cancerous lesions, since such uptake patterns would be assigned to the disentangled healthy component.

CV Jan 25, 2024
Investigating the Quality of DermaMNIST and Fitzpatrick17k Dermatological Image Datasets

Kumar Abhishek, Aditi Jain, Ghassan Hamarneh

The remarkable progress of deep learning in dermatological tasks has brought us closer to achieving diagnostic accuracies comparable to those of human experts. However, while large datasets play a crucial role in the development of reliable deep neural network models, the quality of data therein and their correct usage are of paramount importance. Several factors can impact data quality, such as the presence of duplicates, data leakage across train-test partitions, mislabeled images, and the absence of a well-defined test partition. In this paper, we conduct meticulous analyses of three popular dermatological image datasets: DermaMNIST, its source HAM10000, and Fitzpatrick17k, uncovering these data quality issues, measuring the effects of these problems on the benchmark results, and proposing corrections to the datasets. Besides ensuring the reproducibility of our analysis, by making our analysis pipeline and the accompanying code publicly available, we aim to encourage similar explorations and to facilitate the identification and addressing of potential data quality issues in other large datasets.

LG Jun 3, 2021
Sleeping Combinatorial Bandits

Kumar Abhishek, Ganesh Ghalme, Sujit Gujar et al.

In this paper, we study an interesting combination of sleeping and combinatorial stochastic bandits. In the mixed model studied here, at each discrete time instant, an arbitrary availability set is generated from a fixed set of base arms. An algorithm can select a subset of arms from the availability set (sleeping bandits) and receive the corresponding reward along with semi-bandit feedback (combinatorial bandits). We adapt the well-known CUCB algorithm in the sleeping combinatorial bandits setting and refer to it as CS-UCB. We prove -- under mild smoothness conditions -- that the CS-UCB algorithm achieves an $O(\log (T))$ instance-dependent regret guarantee. We further prove that (i) when the range of the rewards is bounded, the regret guarantee of the CS-UCB algorithm is $O(\sqrt{T \log (T)})$ and (ii) the instance-independent regret is $O(\sqrt[3]{T^2 \log(T)})$ in a general setting. Our results are quite general and hold under general environments -- such as non-additive reward functions, volatile arm availability, a variable number of base-arms to be pulled -- arising in practical applications. We validate the proven theoretical guarantees through experiments.
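The interaction loop described above (an availability set arrives, a subset of available arms is pulled, and per-arm rewards are observed) can be sketched with a UCB-style index restricted to the available arms. This is a simplified illustration, not the algorithm analyzed in the paper: it assumes an additive reward with a top-k selection rule in place of a general oracle, and the class name, the `k` parameter, and the exploration constant 1.5 are choices made for this sketch.

```python
import math


class SleepingTopKUCB:
    """UCB-style learner that picks up to k arms from each round's available set."""

    def __init__(self, n_arms, k):
        self.k = k
        self.t = 0                      # number of rounds played so far
        self.counts = [0] * n_arms      # pulls per arm
        self.means = [0.0] * n_arms     # empirical mean reward per arm

    def _index(self, arm):
        # Arms never pulled get an infinite index so they are tried first.
        if self.counts[arm] == 0:
            return math.inf
        bonus = math.sqrt(1.5 * math.log(self.t) / self.counts[arm])
        return self.means[arm] + bonus

    def select(self, available):
        """Return up to k arms with the highest index among the available ones."""
        self.t += 1
        ranked = sorted(available, key=self._index, reverse=True)
        return ranked[:self.k]

    def update(self, rewards):
        """Semi-bandit feedback: rewards maps each pulled arm to its observed reward."""
        for arm, reward in rewards.items():
            self.counts[arm] += 1
            self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

A typical round would call `select(available)` with that round's availability set, pull the returned arms, and pass the observed per-arm rewards to `update`.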

CV May 2, 2021
Skin3D: Detection and Longitudinal Tracking of Pigmented Skin Lesions in 3D Total-Body Textured Meshes

Mengliu Zhao, Jeremy Kawahara, Kumar Abhishek et al.

We present an automated approach to detect and longitudinally track skin lesions on 3D total-body skin surface scans. The acquired 3D mesh of the subject is unwrapped to a 2D texture image, where a trained object detection model, Faster R-CNN, localizes the lesions within the 2D domain. These detected skin lesions are mapped back to the 3D surface of the subject and, for subjects imaged multiple times, we construct a graph-based matching procedure to longitudinally track lesions that considers the anatomical correspondences among pairs of meshes, the geodesic proximity of corresponding lesions, and the inter-lesion geodesic distances. We evaluated the proposed approach using 3DBodyTex, a publicly available dataset composed of 3D scans imaging the coloured skin (textured meshes) of 200 human subjects. We manually annotated locations that appeared to the human eye to contain a pigmented skin lesion as well as tracked a subset of lesions occurring on the same subject imaged in different poses. Our results, when compared to three human annotators, suggest that the trained Faster R-CNN detects lesions at a similar performance level as the human annotators. Our lesion tracking algorithm achieves an average matching accuracy of 88% on a set of detected corresponding pairs of prominent lesions of subjects imaged in different poses, and an average longitudinal accuracy of 71% when encompassing additional errors due to lesion detection. As there currently is no other large-scale publicly available dataset of 3D total-body skin lesions, we publicly release over 25,000 3DBodyTex manual annotations, which we hope will further research on total-body skin lesion analysis.

LG Feb 9, 2021
A Multi-Arm Bandit Approach To Subset Selection Under Constraints

Ayush Deva, Kumar Abhishek, Sujit Gujar

We explore the class of problems where a central planner needs to select a subset of agents, each with different quality and cost. The planner wants to maximize its utility while ensuring that the average quality of the selected agents is above a certain threshold. When the agents' quality is known, we formulate our problem as an integer linear program (ILP) and propose a deterministic algorithm, namely \dpss\ that provides an exact solution to our ILP. We then consider the setting when the qualities of the agents are unknown. We model this as a Multi-Arm Bandit (MAB) problem and propose \newalgo\ to learn the qualities over multiple rounds. We show that after a certain number of rounds, $τ$, \newalgo\ outputs a subset of agents that satisfy the average quality constraint with a high probability. Next, we provide bounds on $τ$ and prove that after $τ$ rounds, the algorithm incurs a regret of $O(\ln T)$, where $T$ is the total number of rounds. We further illustrate the efficacy of \newalgo\ through simulations. To overcome the computational limitations of \dpss, we propose a polynomial-time greedy algorithm, namely \greedy, that provides an approximate solution to our ILP. We also compare the performance of \dpss\ and \greedy\ through experiments.

IV Dec 14, 2020
D-LEMA: Deep Learning Ensembles from Multiple Annotations -- Application to Skin Lesion Segmentation

Zahra Mirikharaji, Kumar Abhishek, Saeed Izadi et al.

Medical image segmentation annotations suffer from inter- and intra-observer variations even among experts due to intrinsic differences in human annotators and ambiguous boundaries. Leveraging a collection of annotators' opinions for an image is an interesting way of estimating a gold standard. Although training deep models in a supervised setting with a single annotation per image has been extensively studied, generalizing their training to work with datasets containing multiple annotations per image remains a fairly unexplored problem. In this paper, we propose an approach to handle annotators' disagreements when training a deep model. To this end, we propose an ensemble of Bayesian fully convolutional networks (FCNs) for the segmentation task by considering two major factors in the aggregation of multiple ground truth annotations: (1) handling contradictory annotations in the training data originating from inter-annotator disagreements and (2) improving confidence calibration through the fusion of base models' predictions. We demonstrate the superior performance of our approach on the ISIC Archive and explore the generalization performance of our proposed method by cross-dataset evaluation on the PH2 and DermoFit datasets.

CV Mar 23, 2020
Illumination-based Transformations Improve Skin Lesion Segmentation in Dermoscopic Images

Kumar Abhishek, Ghassan Hamarneh, Mark S. Drew

The semantic segmentation of skin lesions is an important and common initial task in the computer aided diagnosis of dermoscopic images. Although deep learning-based approaches have considerably improved the segmentation accuracy, there is still room for improvement by addressing the major challenges, such as variations in lesion shape, size, color and varying levels of contrast. In this work, we propose the first deep semantic segmentation framework for dermoscopic images which incorporates, along with the original RGB images, information extracted using the physics of skin illumination and imaging. In particular, we incorporate information from specific color bands, illumination invariant grayscale images, and shading-attenuated images. We evaluate our method on three datasets: the ISBI ISIC 2017 Skin Lesion Segmentation Challenge dataset, the DermoFit Image Library, and the PH2 dataset and observe improvements of 12.02%, 4.30%, and 8.86% respectively in the mean Jaccard index over a baseline model trained only with RGB images.

GT Feb 26, 2020
Designing Truthful Contextual Multi-Armed Bandits based Sponsored Search Auctions

Kumar Abhishek, Shweta Jain, Sujit Gujar

For sponsored search auctions, we consider the contextual multi-armed bandit problem in the presence of strategic agents. In this setting, at each round, an advertising platform (center) runs an auction to select the best-suited ads relevant to the query posted by the user. It is in the best interest of the center to select an ad that has a high expected value (i.e., probability of getting a click $\times$ value it derives from a click of the ad). The probability of getting a click (CTR) is unknown to the center and depends on the profile (context) of the user posting the query. Further, the value derived from a click is private information of the advertiser and thus needs to be elicited truthfully. The existing solution in this setting is not practical as it suffers from very high regret ($O(T^{\frac{2}{3}})$).

IV Nov 28, 2019
Artificial Intelligence in Glioma Imaging: Challenges and Advances

Weina Jin, Mostafa Fatehi, Kumar Abhishek et al.

Primary brain tumors including gliomas continue to pose significant management challenges to clinicians. While the presentation, the pathology, and the clinical course of these lesions are variable, the initial investigations are usually similar. Patients who are suspected to have a brain tumor will be assessed with computed tomography (CT) and magnetic resonance imaging (MRI). The imaging findings are used by neurosurgeons to determine the feasibility of surgical resection and plan such an undertaking. Imaging studies are also an indispensable tool in tracking tumor progression or its response to treatment. As these imaging studies are non-invasive, relatively cheap and accessible to patients, there have been many efforts over the past two decades to increase the amount of clinically-relevant information that can be extracted from brain imaging. Most recently, artificial intelligence (AI) techniques have been employed to segment and characterize brain tumors, as well as to detect progression or treatment-response. However, the clinical utility of such endeavours remains limited due to challenges in data collection and annotation, model training, and the reliability of AI-generated information. We provide a review of recent advances in addressing the above challenges. First, to overcome the challenge of data paucity, different image imputation and synthesis techniques along with annotation collection efforts are summarized. Next, various training strategies are presented to meet multiple desiderata, such as model performance, generalization ability, data privacy protection, and learning with sparse annotations. Finally, standardized performance evaluation and model interpretability methods have been reviewed. We believe that these technical approaches will facilitate the development of a fully-functional AI tool in the clinical care of patients with gliomas.

CV Nov 16, 2019
Signed Input Regularization

Saeid Asgari Taghanaki, Kumar Abhishek, Ghassan Hamarneh

Over-parameterized deep models usually over-fit to a given training distribution, which makes them sensitive to small changes and out-of-distribution samples at inference time, leading to low generalization performance. To this end, several model-based and randomized data-dependent regularization methods are applied, such as data augmentation, which prevents a model from memorizing the training distribution. Instead of the random transformation of the input images, we propose SIGN, a new regularization method, which modifies the input variables using a linear transformation by estimating each variable's contribution to the final prediction. Our proposed technique maps the input data to a new manifold where the less important variables are de-emphasized. To test the effectiveness of the proposed idea and compare it with other competing methods, we design several test scenarios, such as classification performance, uncertainty, out-of-distribution, and robustness analyses. We compare the methods using three different datasets and four models. We find that SIGN encourages more compact class representations, which results in the model's robustness to random corruptions and out-of-distribution samples while also simultaneously achieving superior performance on normal data compared to other competing methods. Our experiments also demonstrate the successful transferability of the SIGN samples from one model to another.

CVOct 16, 2019
Deep Semantic Segmentation of Natural and Medical Images: A Review

Saeid Asgari Taghanaki, Kumar Abhishek, Joseph Paul Cohen et al.

The semantic image segmentation task consists of classifying each pixel of an image into an instance, where each instance corresponds to a class. This task is a part of the concept of scene understanding or better explaining the global context of an image. In the medical image analysis domain, image segmentation can be used for image-guided interventions, radiotherapy, or improved radiological diagnostics. In this review, we categorize the leading deep learning-based medical and non-medical image segmentation solutions into six main groups: deep architectural, data synthesis-based, loss function-based, sequenced models, weakly supervised, and multi-task methods, and provide a comprehensive review of the contributions in each of these groups. Further, we analyze each variant within these groups, discuss the limitations of the current approaches, and present potential future research directions for semantic image segmentation.

PROct 4, 2019
Introduction to Concentration Inequalities

Kumar Abhishek, Sneha Maheshwari, Sujit Gujar

In this report, we aim to exemplify concentration inequalities and provide easy-to-understand proofs for them. Our focus is on the inequalities which are helpful in the design and analysis of machine learning algorithms.
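
As one standard example of the kind of result such a report deals with, Hoeffding's inequality bounds how far a sum of independent bounded random variables can deviate from its mean. The abstract does not list specific inequalities, so the choice of this example and the symbols $X_i$, $a_i$, $b_i$, $S_n$, and $t$ are assumptions made for illustration. For independent $X_1, \dots, X_n$ with $a_i \le X_i \le b_i$ and $S_n = \sum_{i=1}^{n} X_i$:

```latex
\[
\Pr\bigl(S_n - \mathbb{E}[S_n] \ge t\bigr)
  \le \exp\!\left(-\frac{2t^2}{\sum_{i=1}^{n}(b_i - a_i)^2}\right),
  \qquad t > 0.
\]
```

The bound decays exponentially in $t^2$, which is why inequalities of this form are commonly used to derive sample-size and generalization guarantees.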

IVJun 13, 2019
Mask2Lesion: Mask-Constrained Adversarial Skin Lesion Image Synthesis

Kumar Abhishek, Ghassan Hamarneh

Skin lesion segmentation is a vital task in skin cancer diagnosis and further treatment. Although deep learning based approaches have significantly improved the segmentation accuracy, these algorithms are still reliant on having a large enough dataset in order to achieve adequate results. Inspired by the immense success of generative adversarial networks (GANs), we propose a GAN-based augmentation of the original dataset in order to improve the segmentation performance. In particular, we use the segmentation masks available in the training dataset to train the Mask2Lesion model, and use the model to generate new lesion images given any arbitrary mask, which are then used to augment the original training dataset. We test Mask2Lesion augmentation on the ISBI ISIC 2017 Skin Lesion Segmentation Challenge dataset and achieve an improvement of 5.17% in the mean Dice score as compared to a model trained with only classical data augmentation techniques.
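
The mean Dice score mentioned above can be computed from binary masks as sketched below. This is a minimal illustration of the standard definition, 2|A∩B| / (|A| + |B|), not the evaluation code used in the paper. The function names `dice_score` and `mean_dice` are hypothetical, and returning 1.0 when both masks are empty is an assumed convention.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / total)


def mean_dice(preds, targets):
    """Average the per-image Dice coefficient over a dataset."""
    return float(np.mean([dice_score(p, t) for p, t in zip(preds, targets)]))
```

For example, a prediction covering two pixels, one of which lies inside a one-pixel ground-truth mask, gives 2·1 / (2 + 1) = 2/3.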

CVApr 4, 2019
Improved Inference via Deep Input Transfer

Saeid Asgari Taghanaki, Kumar Abhishek, Ghassan Hamarneh

Although numerous improvements have been made in the field of image segmentation using convolutional neural networks, the majority of these improvements rely on training with larger datasets, model architecture modifications, novel loss functions, and better optimizers. In this paper, we propose a new segmentation performance boosting paradigm that relies on optimally modifying the network's input instead of the network itself. In particular, we leverage the gradients of a trained segmentation network with respect to the input to transfer it to a space where the segmentation accuracy improves. We test the proposed method on three publicly available medical image segmentation datasets: the ISIC 2017 Skin Lesion Segmentation dataset, the Shenzhen Chest X-Ray dataset, and the CVC-ColonDB dataset, for which our method achieves improvements of 5.8%, 0.5%, and 4.8% in the average Dice scores, respectively.

CVMar 3, 2019
A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations

Saeid Asgari Taghanaki, Kumar Abhishek, Shekoofeh Azizi et al.

The linear and non-flexible nature of deep convolutional models makes them vulnerable to carefully crafted adversarial perturbations. To tackle this problem, we propose a non-linear radial basis convolutional feature mapping by learning a Mahalanobis-like distance function. Our method then maps the convolutional features onto a linearly well-separated manifold, which prevents small adversarial perturbations from forcing a sample to cross the decision boundary. We test the proposed method on three publicly available image classification and segmentation datasets, namely MNIST, ISBI ISIC 2017 skin lesion segmentation, and NIH Chest X-Ray-14. We evaluate the robustness of our method to different gradient (targeted and untargeted) and non-gradient based attacks and compare it to several non-gradient masking defense strategies. Our results demonstrate that the proposed method can increase the resilience of deep convolutional neural networks to adversarial perturbations without a drop in accuracy on clean data.

CVJan 12, 2019
Summarization and Visualization of Large Volumes of Broadcast Video Data

Kumar Abhishek, Ashok Yogi

Over the past few years, there has been an astounding growth in the number of news channels as well as the amount of broadcast news video data. As a result, it is imperative that automated methods be developed to effectively summarize and store this voluminous data. Format detection of news videos plays an important role in news video analysis. Our problem involves building a robust and versatile news format detector, which identifies the different band elements in a news frame. The probabilistic progressive Hough transform has been used for the detection of band edges. The detected bands are classified as natural images, computer-generated graphics (non-text), and text bands. A contrast-based text detector has been used to identify the text regions from news frames. Two classifiers have been trained and evaluated for labeling the detected bands as natural or artificial: a Support Vector Machine (SVM) classifier with an RBF kernel, and an Extreme Learning Machine (ELM) classifier. The classifiers have been trained on a dataset of 6000 images (3000 images of each class). The ELM classifier reports a balanced accuracy of 77.38%, while the SVM classifier outperforms it with a balanced accuracy of 96.5% using 10-fold cross-validation. The detected bands which have been fragmented due to the presence of gradients in the image have been merged using a three-tier hierarchical reasoning model. The bands were detected with a Jaccard Index of 0.8138 when compared to manually marked ground truth data. We have also presented an extensive literature review of previous work done towards news video format detection, element band classification, and associative reasoning.
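
The two metrics reported above, balanced accuracy and the Jaccard Index, can be sketched as follows. This is a minimal illustration of the standard definitions rather than the authors' evaluation code. The function names are hypothetical, and treating two empty masks as a perfect match is an assumed convention.

```python
import numpy as np


def jaccard_index(pred, target):
    """Intersection over union between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class contributes equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))
```

With equally sized classes, as in the 3000-per-class dataset described above, balanced accuracy coincides with plain accuracy; the two differ only when the classes are imbalanced.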