Vadlamani Ravi

LG
25 papers
590 citations
Novelty 33%
AI Score 23

25 Papers

LG Nov 19, 2022
Explainable Artificial Intelligence and Causal Inference based ATM Fraud Detection

Yelleti Vivek, Vadlamani Ravi, Abhay Anand Mane et al.

Gaining the trust of customers and providing them empathy are very critical in the financial domain. Frequent occurrence of fraudulent activities affects these two factors. Hence, financial organizations and banks must take utmost care to mitigate them. Among them, fraudulent ATM transactions are a common problem faced by banks. The following are the critical challenges involved in fraud datasets: the dataset is highly imbalanced, the fraud pattern is changing, etc. Owing to the rarity of fraudulent activities, fraud detection can be formulated as either a binary classification problem or one-class classification (OCC). In this study, we applied both formulations to an ATM transactions dataset collected from India. In binary classification, we investigated the effectiveness of various over-sampling techniques, such as the Synthetic Minority Oversampling Technique (SMOTE) and its variants, and Generative Adversarial Networks (GAN), to achieve oversampling. Further, we employed various machine learning techniques viz., Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting Tree (GBT), and Multi-layer perceptron (MLP). GBT outperformed the rest of the models by achieving 0.963 AUC, and DT stands second with 0.958 AUC. DT is the winner if the complexity and interpretability aspects are considered. Among all the oversampling approaches, SMOTE and its variants were observed to perform better. In OCC, IForest attained 0.959 CR, and OCSVM secured second place with 0.947 CR. Further, we incorporated explainable artificial intelligence (XAI) and causal inference (CI) in the fraud detection framework and studied it through various analyses.
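
The SMOTE oversampling named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote_like_oversample`, the neighbour count `k`, and the toy data are assumptions made for illustration. The core idea is that each synthetic minority sample is an interpolation between a real minority sample and one of its nearest minority neighbours.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical 2-feature minority (fraud) samples, for illustration only.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like_oversample(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point lies on a segment between two real minority points, it stays inside the region the minority class already occupies, which is why SMOTE tends to be less noisy than naive random perturbation.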

AI Aug 18, 2022
Explainable Reinforcement Learning on Financial Stock Trading using SHAP

Satyam Kumar, Mendhikar Vishal, Vadlamani Ravi

Explainable Artificial Intelligence (XAI) research gained prominence in recent years in response to the demand for greater transparency and trust in AI from the user communities. This is especially critical because AI is adopted in sensitive fields such as finance, medicine etc., where implications for society, ethics, and safety are immense. Following thorough systematic evaluations, work in XAI has primarily focused on Machine Learning (ML) for categorization, decision, or action. To the best of our knowledge, no work is reported that offers an Explainable Reinforcement Learning (XRL) method for trading financial stocks. In this paper, we proposed to employ SHapley Additive exPlanation (SHAP) on a popular deep reinforcement learning architecture viz., deep Q network (DQN) to explain an action of an agent at a given instance in financial stock trading. To demonstrate the effectiveness of our method, we tested it on two popular datasets namely, SENSEX and DJIA, and reported the results.
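
SHAP attributions are based on Shapley values, which the abstract above applies to a DQN agent's actions. As a rough sketch of the underlying quantity (not of the SHAP library, which uses approximations, and not of the paper's DQN), the code below computes exact Shapley values for a tiny hypothetical value function. The function `toy_q_value` and its feature meanings are invented for illustration.

```python
from itertools import permutations

def shapley_values(value_fn, n_features):
    """Exact Shapley values: average each feature's marginal contribution
    over every possible ordering of the features."""
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        coalition = set()
        for feature in order:
            before = value_fn(frozenset(coalition))
            coalition.add(feature)
            after = value_fn(frozenset(coalition))
            totals[feature] += after - before
    return [t / len(orderings) for t in totals]

def toy_q_value(present):
    """Hypothetical Q-value of a 'buy' action given which features are known."""
    score = 0.0
    if 0 in present:
        score += 2.0  # e.g. price momentum
    if 1 in present:
        score += 1.0  # e.g. traded volume
    if 0 in present and 2 in present:
        score += 1.0  # interaction between feature 0 and feature 2
    return score

phi = shapley_values(toy_q_value, 3)
print(phi)
```

The attributions sum to the difference between the value with all features and the value with none, and the interaction term is split evenly between the two features that produce it. Exact enumeration grows factorially with the number of features, which is why practical tools approximate it.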

AI Jul 31, 2023
Causal Inference for Banking Finance and Insurance A Survey

Satyam Kumar, Yelleti Vivek, Vadlamani Ravi et al.

Causal Inference plays a significant role in explaining the decisions taken by statistical models and artificial intelligence models. Of late, this field started attracting the attention of researchers and practitioners alike. This paper presents a comprehensive survey of 37 papers published during 1992-2023 and concerning the application of causal inference to banking, finance, and insurance. The papers are categorized according to the following families of domains: (i) Banking, (ii) Finance and its subdomains such as corporate finance, governance finance including financial risk and financial policy, financial economics, and behavioral finance, and (iii) Insurance. Further, the paper covers the primary ingredients of causal inference namely, statistical methods such as Bayesian Causal Network, Granger Causality and jargon used thereof such as counterfactuals. The review also recommends some important directions for future research. In conclusion, we observed that the application of causal inference in the banking and insurance sectors is still in its infancy, and thus more research is possible to turn it into a viable method.

LG Mar 8, 2023
ATM Fraud Detection using Streaming Data Analytics

Yelleti Vivek, Vadlamani Ravi, Abhay Anand Mane et al.

Gaining the trust and confidence of customers is the essence of the growth and success of financial institutions and organizations. Of late, the financial industry is significantly impacted by numerous instances of fraudulent activities. Further, owing to the generation of large voluminous datasets, it is highly essential that the underlying framework is scalable and meets real-time needs. To address this issue, in this study, we proposed ATM fraud detection in static and streaming contexts respectively. In the static context, we investigated a parallel and scalable machine learning approach for ATM fraud detection that is built on Spark and trained with a variety of machine learning (ML) models including Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting Tree (GBT), and Multi-layer perceptron (MLP). We also employed several balancing techniques like Synthetic Minority Oversampling Technique (SMOTE) and its variants, and Generative Adversarial Networks (GAN), to address the rarity in the dataset. In addition, we proposed a streaming based ATM fraud detection in the streaming context. Our sliding window based method collects ATM transactions that are performed within a specified time interval and then utilizes them to train several ML models, including NB, RF, DT, and K-Nearest Neighbour (KNN). We selected these models based on their lower model complexity and quicker response time. In both contexts, RF turned out to be the best model. RF obtained the best mean AUC of 0.975 in the static context and mean AUC of 0.910 in the streaming context. RF is also empirically shown to be statistically significantly better than the next-best performing models.
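
The sliding-window idea described in the abstract above (collect the transactions from a specified time interval, then train on them) can be sketched in plain Python. The paper builds on Spark streaming; the class below is only an assumed, simplified stand-in, and the names `SlidingWindow` and `width_seconds` are hypothetical.

```python
from collections import deque

class SlidingWindow:
    """Keep only the records whose timestamp falls within the last
    `width_seconds` of the most recently added record."""

    def __init__(self, width_seconds):
        self.width = width_seconds
        self.buffer = deque()

    def add(self, timestamp, record):
        """Append a record, evict expired ones, and return the current window."""
        self.buffer.append((timestamp, record))
        cutoff = timestamp - self.width
        while self.buffer and self.buffer[0][0] <= cutoff:
            self.buffer.popleft()
        return [r for _, r in self.buffer]

# Hypothetical stream of (timestamp in seconds, transaction id).
window = SlidingWindow(width_seconds=10)
window.add(0, "txn-a")
window.add(5, "txn-b")
current = window.add(12, "txn-c")
print(current)
```

In a streaming setting, the list returned by `add` would be the batch handed to a lightweight model for retraining, which is consistent with the abstract's preference for models with quick response times.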

LG Feb 25, 2023
Chaotic Variational Auto encoder-based Adversarial Machine Learning

Pavan Venkata Sainadh Reddy, Yelleti Vivek, Gopi Pranay et al.

Machine Learning (ML) has become the new contrivance in almost every field. This makes ML models a target of fraudsters through various adversarial attacks, thereby hindering their performance. Evasion and Data-Poison-based attacks are well known, especially in finance, healthcare, etc. This motivated us to propose a novel computationally less expensive attack mechanism based on the adversarial sample generation by Variational Auto Encoder (VAE). It is well known that Wavelet Neural Network (WNN) is considered computationally efficient in solving image and audio processing, speech recognition, and time-series forecasting. This paper proposed VAE-Deep-Wavelet Neural Network (VAE-Deep-WNN), where Encoder and Decoder employ WNN networks. Further, we proposed chaotic variants of both VAE with Multi-layer perceptron (MLP) and Deep-WNN and named them C-VAE-MLP and C-VAE-Deep-WNN, respectively. Here, we employed a Logistic map to generate random noise in the latent space. In this paper, we performed VAE-based adversary sample generation and applied it to various problems in the finance and cybersecurity domains, such as loan default, credit card fraud, and churn modelling. We performed both Evasion and Data-Poison attacks on Logistic Regression (LR) and Decision Tree (DT) models. The results indicated that VAE-Deep-WNN outperformed the rest in the majority of the datasets and models. However, its chaotic variant C-VAE-Deep-WNN performed almost similarly to VAE-Deep-WNN in the majority of the datasets.

LG Dec 15, 2022
Chaotic Variational Auto Encoder based One Class Classifier for Insurance Fraud Detection

K. S. N. V. K. Gangadhar, B. Akhil Kumar, Yelleti Vivek et al.

Of late, insurance fraud detection has assumed immense significance owing to the huge financial & reputational losses fraud entails and the phenomenal success of the fraud detection techniques. Insurance is majorly divided into two categories: (i) Life and (ii) Non-life. Non-life insurance in turn includes health insurance and auto insurance among other things. In either of the categories, the fraud detection techniques should be designed in such a way that they capture as many fraudulent transactions as possible. Owing to the rarity of fraudulent transactions, in this paper, we propose a chaotic variational autoencoder (C-VAE) to perform one-class classification (OCC) on genuine transactions. Here, we employed the logistic chaotic map to generate random noise in the latent space. The effectiveness of C-VAE is demonstrated on the health insurance fraud and auto insurance datasets. We considered vanilla Variational Auto Encoder (VAE) as the baseline. It is observed that C-VAE outperformed VAE in both datasets. C-VAE achieved a classification rate of 77.9% and 87.25% in health and automobile insurance datasets respectively. Further, the t-test conducted at 1% level of significance and 18 degrees of freedom infers that C-VAE is statistically significantly better than the VAE.
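
The logistic chaotic map mentioned in the abstract above follows the recurrence x_{n+1} = r * x_n * (1 - x_n). The sketch below generates such a sequence; the parameter choices (r = 4, the seed value 0.3, and the burn-in length) are assumptions for illustration, and how the paper injects these values into the VAE latent space is not reproduced here.

```python
def logistic_map_sequence(n, x0=0.3, r=4.0, burn_in=100):
    """Return n values of the logistic map x <- r * x * (1 - x),
    after discarding `burn_in` initial iterations."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    sequence = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        sequence.append(x)
    return sequence

# Deterministic, yet irregular, values in [0, 1] that could stand in
# for random draws when sampling latent noise.
noise = logistic_map_sequence(8)
print(noise)
```

Unlike a pseudo-random generator, the sequence is fully determined by `x0` and `r`, but small changes in `x0` lead to very different trajectories, which is the property chaotic variants rely on.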

LG Aug 19, 2022
Application of Causal Inference to Analytical Customer Relationship Management in Banking and Insurance

Satyam Kumar, Vadlamani Ravi

Of late, in order to have better acceptability across various domains, researchers have argued that machine intelligence algorithms must be able to provide explanations that humans can understand causally. This aspect, also known as causability, achieves a specific level of human-level explainability. A specific class of algorithms known as counterfactuals may be able to provide causability. In statistics, causality has been studied and applied for many years, but not in great detail in artificial intelligence (AI). In a first-of-its-kind study, we employed the principles of causal inference to provide explainability for solving the analytical customer relationship management (ACRM) problems. In the context of banking and insurance, current research on interpretability tries to address causality-related questions like why did this model make such decisions, and was the model's choice influenced by a particular factor? We propose a solution in the form of an intervention, wherein the effect of changing the distribution of features of ACRM datasets is studied on the target feature. Subsequently, a set of counterfactuals is also obtained that may be furnished to any customer who demands an explanation of the decision taken by the bank/insurance company. Except for the credit card churn prediction dataset, good quality counterfactuals were generated for the loan default, insurance fraud detection, and credit card fraud detection datasets, where changes in no more than three features are observed.

LG Jul 18, 2022
Explainable Deep Belief Network based Auto encoder using novel Extended Garson Algorithm

Satyam Kumar, Vadlamani Ravi

The most difficult task in machine learning is to interpret trained shallow neural networks. Deep neural networks (DNNs) provide impressive results on a larger number of tasks, but it is generally still unclear how decisions are made by such a trained deep neural network. Providing feature importance is the most important and popular interpretation technique used in shallow and deep neural networks. In this paper, we develop an algorithm extending the idea of Garson Algorithm to explain Deep Belief Network based Auto-encoder (DBNA). It is used to determine the contribution of each input feature in the DBN. It can be used for any kind of neural network with many hidden layers. The effectiveness of this method is tested on both classification and regression datasets taken from literature. Important features identified by this method are compared against those obtained by Wald chi-square (χ²). For 2 out of 4 classification datasets and 2 out of 5 regression datasets, our proposed methodology resulted in the identification of better-quality features leading to statistically more significant results vis-à-vis Wald χ².
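
For context on what the abstract above extends, here is a sketch of one common statement of the classic Garson algorithm for a network with a single hidden layer and a single output. It does not implement the paper's extension to deep belief networks; the function name and the toy weights are assumptions for illustration.

```python
def garson_importance(w_in_hidden, w_hidden_out):
    """Relative importance of each input for a single-hidden-layer network.

    w_in_hidden[i][j] is the weight from input i to hidden unit j;
    w_hidden_out[j] is the weight from hidden unit j to the output.
    """
    n_inputs = len(w_in_hidden)
    n_hidden = len(w_hidden_out)
    contribution = [0.0] * n_inputs
    for j in range(n_hidden):
        # Share of hidden unit j's incoming weight mass owned by each input.
        column_sum = sum(abs(w_in_hidden[i][j]) for i in range(n_inputs))
        for i in range(n_inputs):
            share = abs(w_in_hidden[i][j]) / column_sum
            contribution[i] += share * abs(w_hidden_out[j])
    total = sum(contribution)
    return [c / total for c in contribution]

# Toy weights: 2 inputs, 2 hidden units, 1 output.
w_in_hidden = [[0.5, -1.0],
               [1.5, 1.0]]
w_hidden_out = [2.0, -1.0]
importance = garson_importance(w_in_hidden, w_hidden_out)
print(importance)
```

The method uses absolute weight magnitudes only, so it reports how strongly an input is connected to the output but not the direction of its effect.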

LG May 26, 2022
Privacy-Preserving Wavelet Neural Network with Fully Homomorphic Encryption

Syed Imtiaz Ahamed, Vadlamani Ravi

The main aim of Privacy-Preserving Machine Learning (PPML) is to protect the privacy and provide security to the data used in building Machine Learning models. There are various techniques in PPML such as Secure Multi-Party Computation, Differential Privacy, and Homomorphic Encryption (HE). The techniques are combined with various Machine Learning models and even Deep Learning Networks to protect the data privacy as well as the identity of the user. In this paper, we propose a fully homomorphic encrypted wavelet neural network to protect privacy and at the same time not compromise on the efficiency of the model. We tested the effectiveness of the proposed method on seven datasets taken from the finance and healthcare domains. The results show that our proposed model performs similarly to the unencrypted model.

LG Apr 9, 2023
FedPNN: One-shot Federated Classification via Evolving Clustering Method and Probabilistic Neural Network hybrid

Polaki Durga Prasad, Yelleti Vivek, Vadlamani Ravi

Protecting data privacy is paramount in fields such as finance, banking, and healthcare. Federated Learning (FL) has attracted widespread attention due to its decentralized, distributed training and the ability to protect privacy while obtaining a global shared model. However, FL presents challenges such as communication overhead and limited resource capability. This motivated us to propose a two-stage federated learning approach toward the objective of privacy protection, which is a first-of-its-kind study as follows: (i) During the first stage, the synthetic dataset is generated by employing two different distributions as noise to the vanilla conditional tabular generative adversarial neural network (CTGAN) resulting in modified CTGAN, and (ii) In the second stage, the Federated Probabilistic Neural Network (FedPNN) is developed and employed for building a globally shared classification model. We also employed synthetic dataset metrics to check the quality of the generated synthetic dataset. Further, we proposed a meta-clustering algorithm whereby the cluster centers obtained from the clients are clustered at the server for training the global model. Despite PNN being a one-pass learning classifier, its complexity depends on the training data size. Therefore, we employed a modified evolving clustering method (ECM), another one-pass algorithm to cluster the training data thereby increasing the speed further. Moreover, we conducted sensitivity analysis by varying Dthr, a hyperparameter of ECM at the server and client, one at a time. The effectiveness of our approach is validated on four finance and medical datasets.

LG Aug 4, 2022
Privacy-Preserving Chaotic Extreme Learning Machine with Fully Homomorphic Encryption

Syed Imtiaz Ahamed, Vadlamani Ravi

Machine Learning and Deep Learning models require a lot of data for the training process, and in some scenarios, there might be some sensitive data, such as customer information involved, which the organizations might be hesitant to outsource for model building. Some of the privacy-preserving techniques such as Differential Privacy, Homomorphic Encryption, and Secure Multi-Party Computation can be integrated with different Machine Learning and Deep Learning algorithms to provide security to the data as well as the model. In this paper, we propose a Chaotic Extreme Learning Machine and its encrypted form using Fully Homomorphic Encryption where the weights and biases are generated using a logistic map instead of uniform distribution. Our proposed method has performed either better than or similar to the traditional Extreme Learning Machine on most of the datasets.

LG Feb 23, 2022
Nowcasting the Financial Time Series with Streaming Data Analytics under Apache Spark

Mohammad Arafat Ali Khan, Chandra Bhushan, Vadlamani Ravi et al.

This paper proposes nowcasting of high-frequency financial datasets in real-time with a 5-minute interval using the streaming analytics feature of Apache Spark. The proposed 2-stage method consists of modelling chaos in the first stage and then using a sliding window approach for training with machine learning algorithms namely Lasso Regression, Ridge Regression, Generalised Linear Model, Gradient Boosting Tree and Random Forest available in the MLLib of Apache Spark in the second stage. To test the effectiveness of the proposed methodology, we used 3 different datasets, of which two are stock market datasets, namely National Stock Exchange & Bombay Stock Exchange, and the third is a Bitcoin-INR conversion dataset. For evaluating the proposed methodology, we used metrics such as Symmetric Mean Absolute Percentage Error, Directional Symmetry, and Theil U Coefficient. We tested the significance of each pair of models using the Diebold Mariano (DM) test.

NE Feb 8, 2022
Feature subset selection for Big Data via Chaotic Binary Differential Evolution under Apache Spark

Yelleti Vivek, Vadlamani Ravi, P. Radhakrishna

Feature subset selection (FSS) using a wrapper approach is essentially a combinatorial optimization problem having two objective functions, namely the cardinality of the selected feature subset, which should be minimized, and the corresponding area under the ROC curve (AUC), which should be maximized. In this research study, we propose a novel multiplicative single objective function involving cardinality and AUC. The randomness involved in the Binary Differential Evolution (BDE) may yield less diverse solutions thereby getting trapped in local minima. Hence, we embed Logistic and Tent chaotic maps into the BDE and named it Chaotic Binary Differential Evolution (CBDE). Designing a scalable solution to the FSS is critical when dealing with high-dimensional and voluminous datasets. Hence, we propose a scalable island (iS) based parallelization approach where the data is divided into multiple partitions/islands whereby the solution evolves individually and gets combined eventually in a migration strategy. The results empirically show that the proposed parallel Chaotic Binary Differential Evolution (P-CBDE-iS) is able to find better-quality feature subsets than the Parallel Binary Differential Evolution (P-BDE-iS). Logistic Regression (LR) is used as a classifier owing to its simplicity and power. The speedup attained by the proposed parallel approach signifies its importance.

LG Jan 27, 2022
FinGAN: Generative Adversarial Network for Analytical Customer Relationship Management in Banking and Insurance

Prateek Kate, Vadlamani Ravi, Akhilesh Gangwar

Churn prediction in credit cards, fraud detection in insurance, and loan default prediction are important analytical customer relationship management (ACRM) problems. Since frauds, churns and defaults happen less frequently, the datasets for these problems turn out to be naturally highly unbalanced. Consequently, all supervised machine learning classifiers tend to yield substantial false-positive rates when trained on such unbalanced datasets. We propose two ways of data balancing. In the first, we propose an oversampling method to generate synthetic samples of minority class using Generative Adversarial Network (GAN). We employ Vanilla GAN [1], Wasserstein GAN [2] and CTGAN [3] separately to oversample the minority class samples. In order to assess the efficacy of our proposed approach, we use a host of machine learning classifiers, including Random Forest, Decision Tree, support vector machine (SVM), and Logistic Regression on the data balanced by GANs. In the second method, we introduce a hybrid method to handle data imbalance. In this second way, we utilize the power of undersampling and over-sampling together by augmenting the synthetic minority class data oversampled by GAN with the undersampled majority class data obtained by one-class support vector machine (OCSVM) [4]. We combine both over-sampled data generated by GAN and the data under-sampled by OCSVM [4] and pass the resultant data to classifiers. When we compared our results to those of Farquad et al. [5] and Sundarkumar, Ravi, and Siddeshwar [6], our proposed methods outperformed the previous results in terms of the area under the ROC curve (AUC) on all datasets.

NE Nov 26, 2021
Optimal Technical Indicator-based Trading Strategies Using NSGA-II

P. Shanmukh Kali Prasad, Vadlamani Madhav, Ramanuj Lal et al.

This paper proposes non-dominated sorting genetic algorithm-II (NSGA-II) in the context of technical indicator-based stock trading, by finding optimal combinations of technical indicators to generate buy and sell strategies such that the objectives, namely, Sharpe ratio and Maximum Drawdown, are maximized and minimized respectively. NSGA-II is chosen because it is a very popular and powerful bi-objective evolutionary algorithm. The training and testing used a rolling-based approach (two years training and a year for testing) and thus the results of the approach seem to be considerably better in stable periods without major economic fluctuations. Further, another important contribution of this study is to incorporate the transaction cost and domain expertise in the whole modeling approach.
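
The two objectives named in the abstract above, Sharpe ratio and Maximum Drawdown, can be computed as in the following sketch. The exact conventions used in the paper (annualization, risk-free rate, handling of transaction costs) are not stated in the abstract, so the per-period, zero-risk-free-rate definitions below are assumptions.

```python
import math

def sharpe_ratio(returns, risk_free=0.0):
    """Per-period Sharpe ratio: mean excess return over its sample std dev."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((e - mean) ** 2 for e in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance)

def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

# Hypothetical portfolio values over five periods.
equity = [100.0, 110.0, 99.0, 120.0, 90.0]
returns = [b / a - 1.0 for a, b in zip(equity, equity[1:])]
print(sharpe_ratio(returns), max_drawdown(equity))
```

A bi-objective optimizer would treat the first quantity as something to maximize and the second as something to minimize, so a candidate strategy is only preferred outright when it is at least as good on both.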

NE Jun 26, 2021
Scalable Feature Subset Selection for Big Data using Parallel Hybrid Evolutionary Algorithm based Wrapper in Apache Spark

Yelleti Vivek, Vadlamani Ravi, Pisipati Radhakrishna

Owing to the emergence of large datasets, applying current sequential wrapper-based feature subset selection (FSS) algorithms increases the complexity. This limitation motivated us to propose a wrapper for feature subset selection (FSS) based on parallel and distributed hybrid evolutionary algorithms (EAs) under the Apache Spark environment. The hybrid EAs are based on the BDE and Binary Threshold Accepting (BTA), a point-based EA, which is invoked to enhance the search capability and avoid premature convergence of the PB-DE. Thus, we designed the hybrid variants (i) parallel binary differential evolution and threshold accepting (PB-DETA), where DE and TA work in tandem in every iteration, and (ii) parallel binary threshold accepting and differential evolution (PB-TADE), where TA and DE work in tandem in every iteration under the Apache Spark environment. Both PB-DETA and PB-TADE are compared with the baseline, viz., the parallel version of the binary differential evolution (PB-DE). All three proposed approaches use logistic regression (LR) to compute the fitness function, namely, the area under ROC curve (AUC). The effectiveness of the proposed algorithms is tested over the five large datasets of varying feature space dimension, taken from cyber security and biology domains. It is noteworthy that the PB-TADE turned out to be statistically significant compared to PB-DE and PB-DETA. We reported the speedup analysis, average AUC obtained by the most repeated feature subset, feature subset with high AUC and least cardinality.

NE Feb 23, 2021
Optimal Prediction Intervals for Macroeconomic Time Series Using Chaos and NSGA II

Vangala Sarveswararao, Vadlamani Ravi, Sheik Tanveer Ul Huq

In a first-of-its-kind study, this paper proposes the formulation of constructing prediction intervals (PIs) in a time series as a bi-objective optimization problem and solves it with the help of Nondominated Sorting Genetic Algorithm (NSGA-II). We also proposed modeling the chaos present in the time series as a preprocessor in order to model the deterministic uncertainty present in the time series. Even though the proposed models are general in purpose, they are used here for quantifying the uncertainty in macroeconomic time series forecasting. Ideal PIs should be as narrow as possible while capturing most of the data points. Based on these two objectives, we formulated a bi-objective optimization problem to generate PIs in 2-stages, wherein reconstructing the phase space using Chaos theory (stage-1) is followed by generating optimal point prediction using NSGA-II and these point predictions are in turn used to obtain PIs (stage-2). We also proposed a 3-stage hybrid, wherein the 3rd stage invokes NSGA-II too in order to solve the problem of constructing PIs from the point prediction obtained in 2nd stage. The proposed models when applied to the macroeconomic time series, yielded better results in terms of both prediction interval coverage probability (PICP) and prediction interval average width (PIAW) compared to the state-of-the-art Lower Upper Bound Estimation Method (LUBE) with Gradient Descent (GD). The 3-stage model yielded better PICP compared to the 2-stage model but showed similar performance in PIAW with added computation cost of running NSGA-II second time.
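
The two quality measures in the abstract above can be sketched as follows: prediction interval coverage probability (PICP) is the fraction of observed values that fall inside their intervals, and prediction interval average width (PIAW) is the mean interval width. The data are made up for illustration, and some papers normalize the width by the target range, which this sketch does not do.

```python
def picp(targets, lower, upper):
    """Fraction of targets covered by their [lower, upper] interval."""
    covered = sum(1 for t, lo, hi in zip(targets, lower, upper) if lo <= t <= hi)
    return covered / len(targets)

def piaw(lower, upper):
    """Average width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)

# Hypothetical observations and interval bounds.
targets = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 1.5, 3.5, 3.0]
upper = [1.5, 2.5, 4.5, 5.0]
coverage = picp(targets, lower, upper)
width = piaw(lower, upper)
print(coverage, width)
```

The two measures pull in opposite directions: widening every interval raises coverage but also raises the average width, which is why the abstract frames interval construction as a bi-objective problem.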

CL Jul 2, 2020
A Novel BGCapsule Network for Text Classification

Akhilesh Kumar Gangwar, Vadlamani Ravi

Several text classification tasks such as sentiment analysis, news categorization, multi-label classification and opinion classification are challenging problems even for modern deep learning networks. Recently, Capsule Networks (CapsNets) were proposed for image classification. It has been shown that CapsNets have several advantages over Convolutional Neural Networks (CNNs), while their validity in the domain of text has been less explored. In this paper, we propose a novel hybrid architecture viz., BGCapsule, which is a Capsule model preceded by an ensemble of Bidirectional Gated Recurrent Units (BiGRU) for several text classification tasks. We employed an ensemble of Bidirectional GRUs for the feature extraction layer preceding the primary capsule layer. The hybrid architecture, after performing basic pre-processing steps, consists of five layers: an embedding layer based on GloVe, a BiGRU based ensemble layer, a primary capsule layer, a flatten layer and a fully connected ReLU layer followed by a fully connected softmax layer. In order to evaluate the effectiveness of BGCapsule, we conducted extensive experiments on five benchmark datasets (ranging from 10,000 records to 700,000 records) including Movie Review (MR Imdb 2005), AG News dataset, Dbpedia ontology dataset, Yelp Review Full dataset and Yelp review polarity dataset. These benchmarks cover several text classification tasks such as news categorization, sentiment analysis, multiclass classification, multi-label classification and opinion classification. We found that our proposed architecture (BGCapsule) achieves better accuracy compared to the existing methods without the help of any external linguistic knowledge such as positive sentiment keywords and negative sentiment keywords. Further, BGCapsule converged faster compared to other extant techniques.

CL May 10, 2020
Article citation study: Context enhanced citation sentiment detection

Vishal Vyas, Kumar Ravi, Vadlamani Ravi et al.

Citation sentiment analysis is one of the little studied tasks for scientometric analysis. For citation analysis, we developed eight datasets comprising citation sentences, which are manually annotated by us into three sentiment polarities viz. positive, negative, and neutral. Among eight datasets, three were developed by considering the whole context of citations. Furthermore, we proposed an ensembled feature engineering method comprising word embeddings obtained for texts, parts-of-speech tags, and dependency relationships together. Ensembled features were considered as input to deep learning based approaches for citation sentiment classification, which is in turn compared with the Bag-of-Words approach. Experimental results demonstrate that deep learning is useful for a higher number of samples, whereas support vector machine is the winner for a smaller number of samples. Moreover, context-based samples are proved to be more effective than context-less samples for citation sentiment analysis.

NE May 7, 2020
Evolutionary Multi Objective Optimization Algorithm for Community Detection in Complex Social Networks

Shaik Tanveer ul Huq, Vadlamani Ravi, Kalyanmoy Deb

Most optimization-based community detection approaches formulate the problem in a single or bi-objective framework. In this paper, we propose two variants of a three-objective formulation using a customized non-dominated sorting genetic algorithm III (NSGA-III) to find community structures in a network. In the first variant, named NSGA-III-KRM, we considered Kernel k means, Ratio cut, and Modularity, as the three objectives, whereas the second variant, named NSGA-III-CCM, considers Community score, Community fitness and Modularity, as three objective functions. Experiments are conducted on four benchmark network datasets. Comparison with state-of-the-art approaches along with decomposition-based multi-objective evolutionary algorithm variants (MOEA/D-KRM and MOEA/D-CCM) indicates that the proposed variants yield comparable or better results. This is particularly significant because the addition of the third objective does not worsen the results of the other two objectives. We also propose a simple method to rank the Pareto solutions so obtained by proposing a new measure, namely the ratio of the hyper-volume and inverted generational distance (IGD). The higher the ratio, the better is the Pareto set. This strategy is particularly useful in the absence of empirical attainment function in the multi-objective framework, where the number of objectives is more than two.

NE Mar 20, 2020
Evolutionary Multi-Objective Optimization Framework for Mining Association Rules

Shaik Tanveer Ul Huq, Vadlamani Ravi

In this paper, two multi-objective optimization frameworks in two variants (i.e., NSGA-III-ARM-V1, NSGA-III-ARM-V2; and MOEAD-ARM-V1, MOEAD-ARM-V2) are proposed to find association rules from transactional datasets. The first framework uses Non-dominated sorting genetic algorithm III (NSGA-III) and the second uses Decomposition based multi-objective evolutionary algorithm (MOEA/D) to find the association rules which are diverse, non-redundant and non-dominated (having high objective function values). In both these frameworks, there is no need to specify minimum support and minimum confidence. In the first variant, support, confidence, and lift are considered as objective functions while in second, confidence, lift, and interestingness are considered as objective functions. These frameworks are tested on seven different kinds of datasets including two real-life bank datasets. Our study suggests that NSGA-III-ARM framework works better than MOEAD-ARM framework in both the variants across majority of the datasets.
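
Three of the objective functions listed in the abstract above (support, confidence, and lift) have standard definitions for a rule "antecedent implies consequent", sketched below. The interestingness measure is omitted because the abstract does not define it, and the transactions are invented for illustration. The sketch assumes the antecedent and the consequent each occur in at least one transaction.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a = set(antecedent)
    c = set(consequent)
    count_a = sum(1 for t in transactions if a <= t)
    count_c = sum(1 for t in transactions if c <= t)
    count_both = sum(1 for t in transactions if (a | c) <= t)
    support = count_both / n              # P(A and C)
    confidence = count_both / count_a     # P(C | A)
    lift = confidence / (count_c / n)     # P(C | A) / P(C)
    return support, confidence, lift

# Hypothetical bank-product baskets, one set per customer.
transactions = [
    {"card", "loan"},
    {"card"},
    {"card", "loan", "deposit"},
    {"loan"},
    {"deposit"},
]
support, confidence, lift = rule_metrics(transactions, {"card"}, {"loan"})
print(support, confidence, lift)
```

A lift above 1 indicates that the consequent appears more often alongside the antecedent than it does overall. Optimizing these measures directly removes the need for the minimum support and minimum confidence thresholds that classical rule miners require.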

ST Nov 7, 2019
Predicting Indian stock market using the psycho-linguistic features of financial news

B. Shravan Kumar, Vadlamani Ravi, Rishabh Miglani

Financial forecasting using news articles is an emerging field. In this paper, we proposed hybrid intelligent models for stock market prediction using the psycholinguistic variables (LIWC and TAALES) extracted from news articles as predictor variables. For prediction purposes, we employed various intelligent techniques such as Multilayer Perceptron (MLP), Group Method of Data Handling (GMDH), General Regression Neural Network (GRNN), Random Forest (RF), Quantile Regression Random Forest (QRRF), Classification and regression tree (CART) and Support Vector Regression (SVR). We experimented on the data of 12 companies' stocks, which are listed in the Bombay Stock Exchange (BSE). We employed chi-squared and maximum relevance and minimum redundancy (MRMR) feature selection techniques on the psycho-linguistic features obtained from the news articles. After extensive experimentation, using the Diebold-Mariano test, we conclude that GMDH and GRNN are statistically the best techniques in that order with respect to the MAPE and NRMSE values.

NEOct 9, 2019
Large Scale Global Optimization by Hybrid Evolutionary Computation

Gutha Jaya Krishna, Vadlamani Ravi

In management, business, economics, science, engineering, and research domains, Large Scale Global Optimization (LSGO) plays a predominant and vital role. Though LSGO is applied in many application domains, it is a very troublesome and difficult task. The Congress on Evolutionary Computation (CEC) began an LSGO competition to elicit algorithms for a set of standard unconstrained LSGO benchmark functions. In this paper, we therefore propose a hybrid meta-heuristic algorithm that combines an Improved and Modified Harmony Search (IMHS) with a Modified Differential Evolution (MDE) using an alternate selection strategy. Harmony Search (HS) handles exploration and exploitation, while Differential Evolution perturbs the exploration of IMHS, since harmony search tends to get stuck in the basins of local optima. To judge the performance of the suggested algorithm, we compare it with ten excellent meta-heuristic algorithms on the fifteen LSGO benchmark functions of the CEC 2013 LSGO special session, each of which has 1000 continuous decision variables. The experimental results consistently show that our proposed hybrid meta-heuristic performs statistically on par with some algorithms in a few problems, while it turns out to be the best in a couple of problems.
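
The general idea of pairing Harmony Search with Differential Evolution can be sketched with the classical operators. This is not the paper's IMHS or MDE: it uses textbook HS improvisation and DE/rand/1/bin, and the simple alternation schedule, parameter values, and function names are all illustrative assumptions.

```python
import random
from typing import Callable, List, Tuple

Vector = List[float]


def sphere(x: Vector) -> float:
    """Simple test objective (to be minimized)."""
    return sum(v * v for v in x)


def improvise(memory: List[Vector], lo: float, hi: float,
              hmcr: float, par: float, bw: float, rng: random.Random) -> Vector:
    """Classical Harmony Search improvisation of one new candidate."""
    new = []
    for j in range(len(memory[0])):
        if rng.random() < hmcr:
            value = rng.choice(memory)[j]          # memory consideration
            if rng.random() < par:
                value += rng.uniform(-bw, bw)      # pitch adjustment
        else:
            value = rng.uniform(lo, hi)            # random selection
        new.append(min(hi, max(lo, value)))
    return new


def de_trial(pop: List[Vector], i: int, lo: float, hi: float,
             f: float, cr: float, rng: random.Random) -> Vector:
    """Classical DE/rand/1/bin trial vector for population member i."""
    a, b, c = rng.sample([k for k in range(len(pop)) if k != i], 3)
    dim = len(pop[i])
    forced = rng.randrange(dim)  # at least one coordinate comes from the mutant
    trial = []
    for j in range(dim):
        if rng.random() < cr or j == forced:
            v = pop[a][j] + f * (pop[b][j] - pop[c][j])
        else:
            v = pop[i][j]
        trial.append(min(hi, max(lo, v)))
    return trial


def hybrid_minimize(objective: Callable[[Vector], float], dim: int,
                    lo: float, hi: float, size: int = 20,
                    iters: int = 200, seed: int = 0) -> Tuple[Vector, float]:
    """Alternate HS and DE steps on one shared population (assumed schedule)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(size)]
    fit = [objective(x) for x in pop]
    for t in range(iters):
        if t % 2 == 0:
            # HS step: the new harmony replaces the worst member if it is better.
            cand = improvise(pop, lo, hi, 0.9, 0.3, 0.05 * (hi - lo), rng)
            f_c = objective(cand)
            worst = max(range(size), key=fit.__getitem__)
            if f_c < fit[worst]:
                pop[worst], fit[worst] = cand, f_c
        else:
            # DE step: greedy one-to-one replacement perturbs the population.
            for i in range(size):
                trial = de_trial(pop, i, lo, hi, 0.5, 0.9, rng)
                f_t = objective(trial)
                if f_t <= fit[i]:
                    pop[i], fit[i] = trial, f_t
    best = min(range(size), key=fit.__getitem__)
    return pop[best], fit[best]


if __name__ == "__main__":
    x, fx = hybrid_minimize(sphere, dim=10, lo=-5.0, hi=5.0)
    print("best objective value found:", fx)
```

Both steps only accept candidates that do not worsen the member they replace, so the best objective value in the population never increases from one iteration to the next.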

LGMay 30, 2019
A Review of Deep Learning with Special Emphasis on Architectures, Applications and Recent Trends

Saptarshi Sengupta, Sanchita Basak, Pallabi Saikia et al.

Deep learning has solved a problem that as little as five years ago was thought by many to be intractable - the automatic recognition of patterns in data; and it can do so with accuracy that often surpasses human beings. It has solved problems beyond the realm of traditional, hand-crafted machine learning algorithms and captured the imagination of practitioners trying to make sense out of the flood of data that now inundates our society. As public awareness of the efficacy of DL increases so does the desire to make use of it. But even for highly trained professionals it can be daunting to approach the rapidly increasing body of knowledge produced by experts in the field. Where does one start? How does one determine if a particular model is applicable to their problem? How does one train and deploy such a network? A primer on the subject can be a good place to start. With that in mind, we present an overview of some of the key multilayer ANNs that comprise DL. We also discuss some new automatic architecture optimization protocols that use multi-agent approaches. Further, since guaranteeing system uptime is becoming critical to many computer applications, we include a section on using neural networks for fault detection and subsequent mitigation. This is followed by an exploratory survey of several application areas where DL has emerged as a game-changing technology: anomalous behavior detection in financial applications or in financial time-series forecasting, predictive and prescriptive analytics, medical image processing and analysis and power systems research. The thrust of this review is to outline emerging areas of application-oriented research within the DL community as well as to provide a reference to researchers seeking to use it in their work for what it does best: statistical pattern recognition with unparalleled learning capacity with the ability to scale with information.

NEMay 23, 2019
CUDA-Self-Organizing feature map based visual sentiment analysis of bank customer complaints for Analytical CRM

Rohit Gavval, Vadlamani Ravi, Kalavala Revanth Harshal et al.

With the widespread use of social media, companies now have access to a wealth of customer feedback data, which has valuable applications in Customer Relationship Management (CRM). Analyzing customer grievance data is paramount, as failure to redress grievances speedily would lead to customer churn, resulting in lower profitability. In this paper, we propose a descriptive analytics framework using the Self-Organizing feature Map (SOM) for Visual Sentiment Analysis of customer complaints. The network automatically learns the inherent grouping of the complaints, which can then be visualized using various techniques. Analytical Customer Relationship Management (ACRM) executives can draw useful business insights from the maps and take timely remedial action. We also propose a high-performance version of the algorithm, CUDASOM (CUDA-based Self-Organizing feature Map), implemented using NVIDIA's parallel computing platform, CUDA, which speeds up the processing of high-dimensional text data and generates results quickly. The efficacy of the proposed model has been demonstrated on customer complaints data regarding the products and services of four leading Indian banks. CUDASOM achieved an average speedup of 44 times. Our approach can expand research into intelligent grievance redressal systems that provide rapid solutions to complaining customers.
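
How a self-organizing map learns groupings can be sketched with a minimal sequential (CPU) version of the classical algorithm. This is not CUDASOM: there is no GPU parallelism, the linear decay schedules and parameter values are assumptions, and the input vectors stand in for whatever text features the complaints are encoded as.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Grid = List[List[List[float]]]  # rows x cols x feature dimension


def best_matching_unit(weights: Grid, x: Sequence[float]) -> Tuple[int, int]:
    """Grid coordinates of the node whose weight vector is closest to x."""
    best, best_d = (0, 0), math.inf
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data: List[List[float]], rows: int, cols: int,
              epochs: int = 50, lr0: float = 0.5,
              sigma0: Optional[float] = None, seed: int = 0) -> Grid:
    """Train a rows x cols map with a Gaussian neighborhood (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = sigma0 or max(rows, cols) / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):          # shuffled pass over the data
            frac = step / total
            lr = lr0 * (1.0 - frac)                    # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.5)    # neighborhood shrinks
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    g = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for j in range(dim):
                        w[j] += lr * g * (x[j] - w[j])  # pull node toward the sample
            step += 1
    return weights


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors forming two loose groups.
    data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
    som = train_som(data, rows=3, cols=3)
    for x in data:
        print(x, "->", best_matching_unit(som, x))
```

The per-node distance computation and weight update are independent across nodes, which is the kind of work a GPU implementation can spread over many threads.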