Anurag Sharma

h-index: 13

6 papers

190 citations

Novelty: 48%

AI Score: 28

Ranked #150,429 of 194,257 authors (top 77%); #33,091 in LG (top 82%)

6 Papers

6.4 · LG · Feb 12, 2024

Data Distribution-based Curriculum Learning

Shonal Chaudhry, Anuraganand Sharma

The order of training samples can have a significant impact on the performance of a classifier. Curriculum learning is a method of ordering training samples from easy to hard. This paper proposes a novel curriculum learning approach called Data Distribution-based Curriculum Learning (DDCL). DDCL uses the data distribution of a dataset to build a curriculum based on the order of samples. Two scoring methods, DDCL (Density) and DDCL (Point), are used to score training samples, thus determining their training order. DDCL (Density) uses the sample density to assign scores, while DDCL (Point) utilises the Euclidean distance for scoring. We evaluate the proposed DDCL approach by conducting experiments on multiple datasets using a neural network, a support vector machine and a random forest classifier. Evaluation results show that applying DDCL improves the average classification accuracy for all datasets compared to standard evaluation without any curriculum. Moreover, analysis of the error losses for a single training epoch reveals that convergence is faster with DDCL than with the no-curriculum method.
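As a rough illustration of distance-based curriculum ordering, the sketch below scores each sample by its Euclidean distance to the mean of its own class and sorts from low to high score. The abstract does not say what the distance is measured against or which direction counts as "easy", so the centroid reference, the ascending order, and the function name `curriculum_order` are all assumptions made for this example, not the paper's DDCL (Point) definition.

```python
import numpy as np

def curriculum_order(X, y):
    """Return (order, scores): sample indices sorted from 'easy' to 'hard'.

    Assumption (not taken from the abstract): a sample's score is its
    Euclidean distance to the mean of its own class, and a smaller
    score means an easier sample.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    scores = np.empty(len(X))
    for label in np.unique(y):
        mask = y == label
        centroid = X[mask].mean(axis=0)  # per-class mean vector
        scores[mask] = np.linalg.norm(X[mask] - centroid, axis=1)
    # Stable sort keeps the original order among samples with equal scores.
    order = np.argsort(scores, kind="stable")
    return order, scores

# Toy data: three samples of class 0 and two samples of class 1.
X = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [10.0, 0.0], [10.0, 1.0]])
y = np.array([0, 0, 0, 1, 1])
order, scores = curriculum_order(X, y)
```

A classifier would then be trained on `X[order]` and `y[order]`, either in full or in progressively larger prefixes.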

14.6 · LG · Aug 6, 2021

SMOTified-GAN for class imbalanced pattern classification problems

Anuraganand Sharma, Prabhat Kumar Singh, Rohitash Chandra

Class imbalance in a dataset is a major problem for classifiers that results in poor prediction with a high true positive rate (TPR) but a low true negative rate (TNR) for a majority positive training dataset. Generally, the pre-processing technique of oversampling the minority class(es) is used to overcome this deficiency. Our focus is on using the hybridization of Generative Adversarial Network (GAN) and Synthetic Minority Over-Sampling Technique (SMOTE) to address class imbalanced problems. We propose a novel two-phase oversampling approach involving knowledge transfer that has the synergy of SMOTE and GAN. The unrealistic or overgeneralized samples of SMOTE are transformed into a realistic distribution of data by GAN where there is not enough minority class data available for GAN to process them by itself effectively. We named it SMOTified-GAN as GAN works on pre-sampled minority data produced by SMOTE rather than randomly generating the samples itself. The experimental results show that the sample quality of the minority class(es) has been improved in a variety of tested benchmark datasets. Its performance is improved by up to 9% over the next best algorithm tested on F1-score measurements. Its time complexity is also reasonable, at around $O(N^2d^2T)$ for a sequential algorithm.
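The first phase described above relies on SMOTE-style oversampling, which creates a synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbours. The sketch below shows only that interpolation step in NumPy; the function name `smote_like_samples` and its parameters are hypothetical, and the GAN phase that refines these samples is not shown.

```python
import numpy as np

def smote_like_samples(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic points from minority samples X_min.

    Illustrative SMOTE-style interpolation only; needs at least two
    minority samples. The GAN refinement stage is out of scope here.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbours than other points

    # Pairwise Euclidean distances between minority samples.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)   # which point to start from
    pick = rng.integers(0, k, size=n_new)   # which of its k neighbours
    partner = neighbours[base, pick]
    gap = rng.random((n_new, 1))            # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[partner] - X_min[base])

# Toy minority class: the four corners of the unit square.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_samples(X_min, n_new=10)
```

Each synthetic point lies on the line segment between two real minority samples, which is why such samples can look overgeneralized before any further refinement.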

3.1 · LG · Jan 17, 2021

Guided parallelized stochastic gradient descent for delay compensation

Anuraganand Sharma

The stochastic gradient descent (SGD) algorithm and its variations have been effectively used to optimize neural network models. However, with the rapid growth of big data and deep learning, SGD is no longer the most suitable choice due to its natural behavior of sequential optimization of the error function. This has led to the development of parallel SGD algorithms, such as asynchronous SGD (ASGD) and synchronous SGD (SSGD), to train deep neural networks. However, parallelization introduces a high variance due to the delay in parameter (weight) updates. We address this delay in our proposed algorithm and try to minimize its impact. We employed guided SGD (gSGD), which encourages consistent examples to steer the convergence by compensating for the unpredictable deviation caused by the delay. Its convergence rate is also similar to A/SSGD; however, some additional (parallel) processing is required to compensate for the delay. The experimental results demonstrate that our proposed approach has been able to mitigate the impact of delay on classification accuracy. The guided approach with SSGD clearly outperforms sequential SGD and even achieves the accuracy close to sequential SGD for some benchmark datasets.

5.0 · CV · Jul 7, 2020 · Code

Classification with 2-D Convolutional Neural Networks for breast cancer diagnosis

Anuraganand Sharma, Dinesh Kumar

Breast cancer is the most common cancer in women. Classification of cancer/non-cancer patients with clinical records requires high sensitivity and specificity for an acceptable diagnostic test. The state-of-the-art classification model, the Convolutional Neural Network (CNN), however, cannot be used with clinical data that are represented in 1-D format. CNN has been designed to work on a set of 2-D matrices whose elements show some correlation with neighboring elements, such as in image data. Conversely, data examples represented as a set of 1-D vectors (apart from time series data) cannot be used with CNN, but can be used with other classification models such as Artificial Neural Networks or Random Forest. We have proposed some novel preprocessing methods of data wrangling that transform a 1-D data vector to a 2-D graphical image with appropriate correlations among the fields to be processed on CNN. We tested our methods on the Wisconsin Original Breast Cancer (WBC) and Wisconsin Diagnostic Breast Cancer (WDBC) datasets. To our knowledge, this work is novel on non-image to image data transformation for non-time series data. The transformed data processed with CNN using VGGnet-16 shows competitive results for the WBC dataset and outperforms other known methods for the WDBC dataset.

2.4 · NE · Feb 8, 2020

A Constraint Driven Solution Model for Discrete Domains with a Case Study of Exam Timetabling Problems

Anuraganand Sharma

Many science and engineering applications require finding solutions to planning and optimization problems by satisfying a set of constraints. These constraint problems (CPs) are typically NP-complete and can be formalized as constraint satisfaction problems (CSPs) or constraint optimization problems (COPs). Evolutionary algorithms (EAs) are good solvers for optimization problems ubiquitous in various problem domains; however, traditional operators for EAs are 'blind' to constraints or generally use problem-dependent objective functions, as they do not exploit information from the constraints in the search for solutions. A variation of EA, the Intelligent constraint handling evolutionary algorithm (ICHEA), has been demonstrated to be a versatile constraints-guided EA for continuous constrained problems in our earlier work (Sharma and Sharma, 2012), where it extracts information from constraints and exploits it in the evolutionary search to make the search more efficient. In this paper, ICHEA has been demonstrated to solve benchmark exam timetabling problems, a classic COP. The presented approach demonstrates competitive results with other state-of-the-art approaches in EAs in terms of quality of solutions. ICHEA first uses its inter-marriage crossover operator to satisfy all the given constraints incrementally and then uses a combination of traditional and enhanced operators to optimize the solution. Generally, CPs solved by EAs use problem-dependent, penalty-based fitness functions. We also proposed a generic preference-based solution model that does not require a problem-dependent fitness function; however, it currently works only for mutually exclusive constraints.

4.3 · LG · Aug 22, 2017

Stacked transfer learning for tropical cyclone intensity prediction

Ratneel Vikash Deo, Rohitash Chandra, Anuraganand Sharma

Tropical cyclone wind-intensity prediction is a challenging task considering drastic changes in climate patterns over the last few decades. In order to develop robust prediction models, one needs to consider the spatial and temporal characteristics of cyclones. Transfer learning incorporates knowledge from a related source dataset to complement a target dataset, especially in cases where there is a lack of data. Stacking is a form of ensemble learning focused on improving generalization that has recently been used for transfer learning problems, where it is referred to as transfer stacking. In this paper, we employ transfer stacking as a means of studying the effects of cyclones, whereby we evaluate whether cyclones in different geographic locations can be helpful for improving generalization performance. Moreover, we use conventional neural networks for evaluating the effects of cyclone duration on prediction performance. Therefore, we develop an effective strategy that evaluates the relationships between different types of cyclones through transfer learning and conventional learning methods via neural networks.