Mrinal Haloi

h-index: 9

15 papers

1,163 citations

Novelty: 40%

AI Score: 26

Ranked #158,770 of 194,257 authors (top 82%); #51,414 in CV (top 87%)

15 Papers

4.8 | CV | Aug 31, 2022 | Code

Table Detection in the Wild: A Novel Diverse Table Detection Dataset and Method

Mrinal Haloi, Shashank Shekhar, Nikhil Fande et al.

Recent deep learning approaches to table detection have achieved outstanding performance and proved effective in identifying document layouts. Currently available table detection benchmarks have many limitations, including the lack of sample diversity, simple table structures, the lack of training cases, and sample quality. In this paper, we introduce a diverse large-scale dataset for table detection with more than seven thousand samples containing a wide variety of table structures collected from many diverse sources. In addition, we present baseline results using a convolutional neural network-based method to detect table structure in documents. Experimental results show the superiority of convolutional deep learning methods over classical computer vision-based methods. The introduction of this diverse table detection dataset will enable the community to develop high-throughput deep learning methods for understanding document layout and tabular data processing. The dataset is available at: 1. https://www.kaggle.com/datasets/mrinalim/stdw-dataset 2. https://huggingface.co/datasets/n3011/STDW

1.4 | CV | Jan 19, 2022

An Ensemble Model for Face Liveness Detection

Shashank Shekhar, Avinash Patel, Mrinal Haloi et al.

In this paper, we present a passive method to detect face presentation attacks, a.k.a. face liveness detection, using an ensemble deep learning technique. Face liveness detection is one of the key steps in verifying the identity of customers during online onboarding and transaction processes. During identity verification, an unauthenticated user may try to bypass the verification system by several means: for example, they can capture a user's photo from social media and mount an impostor attack using printouts of the user's face or a digital photo on a mobile device, or even create a more sophisticated attack such as a video replay attack. We have studied the different methods of attack and created a large-scale in-house dataset covering all these kinds of attacks to train a robust deep learning model. We propose an ensemble method in which multiple features of the face and background regions are learned to predict whether the user is bona fide or an attacker.

1.7 | CV | Dec 12, 2018

Towards Ophthalmologist Level Accurate Deep Learning System for OCT Screening and Diagnosis

Mrinal Haloi

In this work, we propose an advanced AI-based grading system for OCT images. The proposed system is a very deep fully convolutional attentive classification network trained end to end with advanced transfer learning and online random augmentation. It uses quasi-random augmentation to output confidence values for disease prevalence during inference. It is a fully automated retinal OCT analysis AI system capable of understanding pathological lesions without any offline preprocessing/postprocessing step or manual feature extraction. We present state-of-the-art performance on the publicly available Mendeley OCT dataset.

2.5 | CV | Jun 25, 2018

Towards Radiologist-Level Accurate Deep Learning System for Pulmonary Screening

Mrinal Haloi, K. Raja Rajalakshmi, Pradeep Walia

In this work, we propose an advanced pneumonia and tuberculosis grading system for X-ray images. The proposed system is a very deep fully convolutional classification network with online augmentation that outputs confidence values for disease prevalence. It is a fully automated system capable of understanding disease features without any offline preprocessing step or manual feature extraction. We have achieved state-of-the-art performance on public databases such as ChestXray-14, Mendeley, Shenzhen Hospital X-ray and the Belarus X-ray set.

3.2 | LG | Oct 22, 2017

Rethinking Convolutional Semantic Segmentation Learning

Mrinal Haloi

Deep convolutional semantic segmentation (DCSS) learning doesn't converge to an optimal local minimum with random parameter initialization; a model pre-trained on the same domain becomes necessary to achieve convergence. In this work, we propose a joint cooperative end-to-end learning method for DCSS. It addresses many drawbacks of existing deep semantic segmentation learning: the proposed approach simultaneously learns both segmentation and classification, removing the need for a pre-trained model for learning convergence. We present an improved inception-based architecture with partial attention gating (PAG) over encoder information. The PAG also contributes to faster convergence and better accuracy on the segmentation task. We show the effectiveness of this learning on a diabetic retinopathy classification and segmentation dataset.

2.6 | LG | Jun 6, 2017

Deep Learning: Generalization Requires Deep Compositional Feature Space Design

Mrinal Haloi

Generalization error defines the discriminability and the representation power of a deep model. In this work, we claim that feature space design using deep compositional functions plays a significant role in generalization, along with explicit and implicit regularization. Our claims are supported by several image classification experiments. We show that the information loss due to convolution and max pooling can be marginalized with the compositional design, improving generalization performance. We also show that learning rate decay acts as an implicit regularizer in deep model training.

28.1 | CV | Jul 28, 2016

Gated Siamese Convolutional Neural Network Architecture for Human Re-Identification

Rahul Rama Varior, Mrinal Haloi, Gang Wang

Matching pedestrians across multiple camera views, known as human re-identification, is a challenging research problem that has numerous applications in visual surveillance. With the resurgence of Convolutional Neural Networks (CNNs), several end-to-end deep Siamese CNN architectures have been proposed for human re-identification with the objective of projecting the images of similar pairs (i.e. same identity) to be closer to each other and those of dissimilar pairs to be distant from each other. However, current networks extract fixed representations for each image regardless of other images which are paired with it and the comparison with other images is done only at the final level. In this setting, the network is at risk of failing to extract finer local patterns that may be essential to distinguish positive pairs from hard negative pairs. In this paper, we propose a gating function to selectively emphasize such fine common local patterns by comparing the mid-level features across pairs of images. This produces flexible representations for the same image according to the images they are paired with. We conduct experiments on the CUHK03, Market-1501 and VIPeR datasets and demonstrate improved performance compared to a baseline Siamese CNN architecture.

1.1 | CV | Jan 25, 2016

An Unsupervised Method for Detection and Validation of The Optic Disc and The Fovea

Mrinal Haloi, Samarendra Dandapat, Rohit Sinha

In this work, we present a novel method for detecting retinal image features, the optic disc and the fovea, from colour fundus photographs of dilated eyes for a Computer-aided Diagnosis (CAD) system. A saliency-map-based method is used to detect the optic disc, followed by unsupervised probabilistic Latent Semantic Analysis for detection validation. The validation concept is based on the distinct vessel structures in the optic disc. Using clinical information on the standard location of the fovea with respect to the optic disc, the macula region is estimated. A detection accuracy of 100% is achieved for the optic disc and the macula on MESSIDOR and DIARETDB1, and 98.8% on the STARE dataset.

9.1 | CV | Nov 10, 2015 | Code

Traffic Sign Classification Using Deep Inception Based Convolutional Networks

Mrinal Haloi

In this work, we propose a novel deep network for traffic sign classification that achieves outstanding performance on GTSRB, surpassing all previous methods. Our deep network consists of spatial transformer layers and a modified version of the inception module specifically designed to capture local and global features together. Adopting these features allows our network to classify intraclass samples precisely even under deformations. The use of spatial transformer layers makes this network more robust to deformations such as translation, rotation and scaling of input images. Unlike existing approaches that are developed with hand-crafted features, multiple deep networks with huge numbers of parameters and data augmentation, our method addresses the concern of exploding parameters and augmentations. We have achieved state-of-the-art performance of 99.81% on the GTSRB dataset.

12.7 | CV | May 17, 2015

Improved Microaneurysm Detection using Deep Neural Networks

Mrinal Haloi

In this work, we propose a novel microaneurysm (MA) detection method for early diabetic retinopathy screening using color fundus images. Since MAs are usually the first lesions to appear as an indicator of diabetic retinopathy, accurate detection of MAs is necessary for treatment. Each pixel of the image is classified as either MA or non-MA using a deep neural network trained with dropout and the maxout activation function. No preprocessing step or manual feature extraction is required. Substantial improvements are achieved over the standard MA detection approach based on a pipeline of preprocessing, feature extraction, classification and post-processing. The presented method is evaluated on the publicly available Retinopathy Online Challenge (ROC) and Diaretdb1v2 databases and achieves state-of-the-art accuracy.

7.0 | CV | May 4, 2015

A Gaussian Scale Space Approach For Exudates Detection, Classification And Severity Prediction

Mrinal Haloi, Samarendra Dandapat, Rohit Sinha

In the context of a Computer Aided Diagnosis system for diabetic retinopathy, we present a novel method for detection of exudates and their classification for disease severity prediction. The method is based on a Gaussian scale space based interest map and mathematical morphology. It uses a support vector machine for classification, and location information of the optic disc and the macula region for severity prediction. It can efficiently handle luminance variation and is suitable for exudates of varied sizes. The method has been evaluated on the publicly available DIARETDB1V2 and e-ophthaEX databases. For exudate detection, the proposed method achieved a sensitivity of 96.54% and prediction of 98.35% on the DIARETDB1V2 database.
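As a reading aid for the two figures quoted above, the following minimal sketch shows how sensitivity and a predictive value are commonly computed from detection counts. It assumes that "prediction" means positive predictive value, which the abstract does not state; the function names and the counts are hypothetical and are not taken from the paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual lesions that were detected: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of detections that are actual lesions: TP / (TP + FP).

    Assumption: this is one plausible reading of "prediction" in the abstract.
    """
    return true_pos / (true_pos + false_pos)


# Hypothetical counts, for illustration only.
tp, fn, fp = 90, 10, 5
sens = sensitivity(tp, fn)
ppv = positive_predictive_value(tp, fp)
print(f"sensitivity={sens:.4f}, PPV={ppv:.4f}")
```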

2.5 | CV | Apr 28, 2015

A Robust Lane Detection and Departure Warning System

Mrinal Haloi, Dinesh Babu Jayagopi

In this work, we have developed a robust lane detection and departure warning technique. Our system is based on a single camera sensor. For lane detection, a modified Inverse Perspective Mapping using only a few extrinsic camera parameters and illuminant-invariant techniques are used. Lane markings are represented using a combination of 2nd and 4th order steerable filters, robust to shadowing. The effects of shadowing and extra sunlight are removed using the Lab color space and an illuminant-invariant representation. Lanes are assumed to be cubic curves and are fitted using robust RANSAC. This method can reliably detect the lanes of the road and its boundary. The method has been tested in Indian road conditions under different challenging situations, and the results obtained were very good. For the lane departure angle, an optical-flow-based method was used.

4.5 | CV | Mar 23, 2015

Vehicle Local Position Estimation System

Mrinal Haloi, Dinesh Babu Jayagopi

In this paper, a robust vehicle local position estimation method using a single camera sensor and GPS is presented. A modified Inverse Perspective Mapping, illuminant-invariant techniques and an object-detection-based approach are used to localize the vehicle on the road. The vehicle's current lane and its position relative to the road boundary and other cars are used to define its local position. For this purpose, lane markings are detected using a Laplacian edge feature, robust to shadowing. The effects of shadowing and extra sunlight are removed using the Lab color space and illuminant-invariant techniques. Lanes are assumed to follow a parabolic model and are fitted using robust RANSAC. This method can reliably detect all lanes of the road and estimate the lane departure angle and the local position of the vehicle relative to lanes, the road boundary and other cars. Different types of obstacles, such as pedestrians and vehicles, are detected using a HOG-feature-based deformable part model.
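The parabolic RANSAC fit mentioned in this abstract can be illustrated with a minimal, generic sketch. It is not the authors' implementation: the function names, the inlier tolerance, the iteration count and the synthetic points are all assumptions chosen for illustration, and the usual final least-squares refit on the inliers is omitted for brevity.

```python
import random


def fit_parabola(p0, p1, p2):
    """Exact parabola y = a*x^2 + b*x + c through three points (Lagrange form)."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    d0 = (x0 - x1) * (x0 - x2)
    d1 = (x1 - x0) * (x1 - x2)
    d2 = (x2 - x0) * (x2 - x1)
    if d0 == 0 or d1 == 0 or d2 == 0:
        return None  # repeated x values: no unique parabola
    a = y0 / d0 + y1 / d1 + y2 / d2
    b = -(y0 * (x1 + x2) / d0 + y1 * (x0 + x2) / d1 + y2 * (x0 + x1) / d2)
    c = y0 * x1 * x2 / d0 + y1 * x0 * x2 / d1 + y2 * x0 * x1 / d2
    return a, b, c


def ransac_parabola(points, n_iters=200, tol=0.5, seed=0):
    """Keep the 3-point parabola that explains the most points within `tol`."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        model = fit_parabola(*rng.sample(points, 3))
        if model is None:
            continue
        a, b, c = model
        inliers = [(x, y) for x, y in points if abs(a * x * x + b * x + c - y) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Synthetic "lane marking" points on y = 0.5x^2 - x + 2, plus gross outliers
# standing in for shadows or clutter (hypothetical data).
points = [(float(x), 0.5 * x * x - x + 2.0) for x in range(20)]
points += [(3.0, 40.0), (7.0, -15.0), (11.0, 90.0), (15.0, 5.0), (18.0, 200.0)]

model, inliers = ransac_parabola(points)
print("estimated (a, b, c):", model, "inliers:", len(inliers))
```

Sampling minimal sets of three points keeps each hypothesis cheap, and outliers only affect the hypotheses that happen to include them, which is why the approach tolerates spurious edge responses.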

6.3 | CV | Mar 23, 2015

A novel pLSA based Traffic Signs Classification System

Mrinal Haloi

In this work we developed a novel and fast traffic sign recognition system, a very important part of advanced driver assistance systems and autonomous driving. Traffic signs play a vital role in safe driving and avoiding accidents. We have used image processing and the topic discovery model pLSA to tackle this challenging multiclass classification problem. Our algorithm consists of two parts, shape classification and sign classification, for improved accuracy. For processing and representation of images we have used a bag-of-features model with the SIFT local descriptor, where a visual vocabulary of 300 words is formed using the k-means codebook formation algorithm. We exploited the concept that every image is a collection of visual topics and that images having the same topics belong to the same category. Our algorithm is tested on the German Traffic Sign Recognition Benchmark (GTSRB) and gives very promising results, close to existing state-of-the-art techniques.
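The idea that each image is a mixture of visual topics can be sketched with a textbook pLSA fitted by EM on a tiny image-by-visual-word count matrix. This is a generic illustration under stated assumptions, not the paper's pipeline: the function name `plsa`, the toy counts (6 visual words instead of 300) and the number of topics are hypothetical, and SIFT extraction and k-means codebook formation are assumed to have already produced the counts.

```python
import random


def plsa(counts, n_topics, n_iters=50, seed=0):
    """Textbook pLSA via EM. counts[d][w] = occurrences of visual word w in image d.

    Returns (p_w_z, p_z_d): P(word | topic) per topic and P(topic | image) per image.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        return [v / total for v in row]

    # Random initialisation of both distributions.
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iters):
        # Small positive floor keeps every probability strictly positive.
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: posterior P(z | d, w) proportional to P(w | z) * P(z | d).
                post = normalize([p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)])
                # M-step accumulation of expected counts.
                for z in range(n_topics):
                    r = counts[d][w] * post[z]
                    new_w_z[z][w] += r
                    new_z_d[d][z] += r
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_w_z, p_z_d


# Hypothetical bag-of-features counts: 4 images over a 6-word visual vocabulary.
counts = [
    [5, 3, 2, 0, 0, 0],
    [4, 4, 2, 0, 0, 0],
    [0, 0, 0, 3, 5, 2],
    [0, 0, 0, 2, 4, 4],
]
p_w_z, p_z_d = plsa(counts, n_topics=2)

# Assign each image to its most probable topic, one simple way to use P(z | d).
dominant = [max(range(len(row)), key=row.__getitem__) for row in p_z_d]
print("dominant topic per image:", dominant)
```

In a classification setting, the per-image topic distribution P(z | d) would serve as a compact feature; how the paper maps topics to sign categories is not described in the abstract.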

1.3 | CV | Mar 13, 2015

Characterizing driving behavior using automatic visual analysis

Mrinal Haloi, Dinesh Babu Jayagopi

In this work, we present the problem of rash driving detection using a single wide-angle camera sensor, which is particularly useful in the Indian context. To our knowledge, the rash driving problem has not been addressed using image processing techniques (existing works use other sensors such as accelerometers). The car image processing literature, though rich and mature, does not address the rash driving problem. In this work-in-progress paper, we present the need to address this problem, our approach and our future plans to build a rash driving detector.