CVJun 17, 2021Code
SE-MD: A Single-encoder multiple-decoder deep network for point cloud generation from 2D imagesAbdul Mueed Hafiz, Rouf Ul Alam Bhat, Shabir Ahmad Parah et al.
3D model generation from single 2D RGB images is a challenging and actively researched computer vision task. Various techniques using conventional network architectures have been proposed for the same. However, the body of research work is limited and there are various issues like using inefficient 3D representation formats, weak 3D model generation backbones, inability to generate dense point clouds, dependence of post-processing for generation of dense point clouds, and dependence on silhouettes in RGB images. In this paper, a novel 2D RGB image to point cloud conversion technique is proposed, which improves the state of art in the field due to its efficient, robust and simple model by using the concept of parallelization in network architecture. It not only uses the efficient and rich 3D representation of point clouds, but also uses a novel and robust point cloud generation backbone in order to address the prevalent issues. This involves using a single-encoder multiple-decoder deep network architecture wherein each decoder generates certain fixed viewpoints. This is followed by fusing all the viewpoints to generate a dense point cloud. Various experiments are conducted on the technique and its performance is compared with those of other state of the art techniques and impressive gains in performance are demonstrated. Code is available at https://github.com/mueedhafiz1982/
CVJul 23, 2020Code
Deep Network Ensemble Learning applied to Image Classification using CNN TreesAbdul Mueed Hafiz, Ghulam Mohiuddin Bhat
Traditional machine learning approaches may fail to perform satisfactorily when dealing with complex data. In this context, the importance of data mining evolves w.r.t. building an efficient knowledge discovery and mining framework. Ensemble learning is aimed at integration of fusion, modeling and mining of data into a unified model. However, traditional ensemble learning methods are complex and have optimization or tuning problems. In this paper, we propose a simple, sequential, efficient, ensemble learning approach using multiple deep networks. The deep network used in the ensembles is ResNet50. The model draws inspiration from binary decision/classification trees. The proposed approach is compared against the baseline viz. the single classifier approach i.e. using a single multiclass ResNet50 on the ImageNet and Natural Images datasets. Our approach outperforms the baseline on all experiments on the ImageNet dataset. Code is available in https://github.com/mueedhafiz1982/CNNTreeEnsemble.git
CVJun 3, 2021
Attention mechanisms and deep learning for machine vision: A survey of the state of the artAbdul Mueed Hafiz, Shabir Ahmad Parah, Rouf Ul Alam Bhat
With the advent of state of the art nature-inspired pure attention based models i.e. transformers, and their success in natural language processing (NLP), their extension to machine vision (MV) tasks was inevitable and much felt. Subsequently, vision transformers (ViTs) were introduced which are giving quite a challenge to the established deep learning based machine vision techniques. However, pure attention based models/architectures like transformers require huge data, large training times and large computational resources. Some recent works suggest that combinations of these two varied fields can prove to build systems which have the advantages of both these fields. Accordingly, this state of the art survey paper is introduced which hopefully will help readers get useful information about this interesting and potential research area. A gentle introduction to attention mechanisms is given, followed by a discussion of the popular attention based deep architectures. Subsequently, the major categories of the intersection of attention mechanisms and deep learning for machine vision (MV) based are discussed. Afterwards, the major algorithms, issues and trends within the scope of the paper are discussed.
LGAug 6, 2020
Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action AgentsAbdul Mueed Hafiz, Ghulam Mohiuddin Bhat
Deep Q-Network (DQN) based multi-agent systems (MAS) for reinforcement learning (RL) use various schemes where in the agents have to learn and communicate. The learning is however specific to each agent and communication may be satisfactorily designed for the agents. As more complex Deep QNetworks come to the fore, the overall complexity of the multi-agent system increases leading to issues like difficulty in training, need for higher resources and more training time, difficulty in fine-tuning, etc. To address these issues we propose a simple but efficient DQN based MAS for RL which uses shared state and rewards, but agent-specific actions, for updation of the experience replay pool of the DQNs, where each agent is a DQN. The benefits of the approach are overall simplicity, faster convergence and better performance as compared to conventional DQN based approaches. It should be noted that the method can be extended to any DQN. As such we use simple DQN and DDQN (Double Q-learning) respectively on three separate tasks i.e. Cartpole-v1 (OpenAI Gym environment) , LunarLander-v2 (OpenAI Gym environment) and Maze Traversal (customized environment). The proposed approach outperforms the baseline on these tasks by decent margins respectively.
CVJun 28, 2020
Digit Image Recognition Using an Ensemble of One-Versus-All Deep Network ClassifiersAbdul Mueed Hafiz, Mahmoud Hassaballah
In multiclass deep network classifiers, the burden of classifying samples of different classes is put on a single classifier. As the result the optimum classification accuracy is not obtained. Also training times are large due to running the CNN training on single CPU/GPU. However it is known that using ensembles of classifiers increases the performance. Also, the training times can be reduced by running each member of the ensemble on a separate processor. Ensemble learning has been used in the past for traditional methods to a varying extent and is a hot topic. With the advent of deep learning, ensemble learning has been applied to the former as well. However, an area which is unexplored and has potential is One-Versus-All (OVA) deep ensemble learning. In this paper we explore it and show that by using OVA ensembles of deep networks, improvements in performance of deep networks can be obtained. As shown in this paper, the classification capability of deep networks can be further increased by using an ensemble of binary classification (OVA) deep networks. We implement a novel technique for the case of digit image recognition and test and evaluate it on the same. In the proposed approach, a single OVA deep network classifier is dedicated to each category. Subsequently, OVA deep network ensembles have been investigated. Every network in an ensemble has been trained by an OVA training technique using the Stochastic Gradient Descent with Momentum Algorithm (SGDMA). For classification of a test sample, the sample is presented to each network in the ensemble. After prediction score voting, the network with the largest score is assumed to have classified the sample. The experimentation has been done on the MNIST digit dataset, the USPS+ digit dataset, and MATLAB digit image dataset. Our proposed technique outperforms the baseline on digit image recognition for all datasets.
CVJun 28, 2020
Image Classification by Reinforcement Learning with Two-State Q-LearningAbdul Mueed Hafiz
In this paper, a simple and efficient Hybrid Classifier is presented which is based on deep learning and reinforcement learning. Here, Q-Learning has been used with two states and 'two or three' actions. Other techniques found in the literature use feature map extracted from Convolutional Neural Networks and use these in the Q-states along with past history. This leads to technical difficulties in these approaches because the number of states is high due to large dimensions of the feature map. Because the proposed technique uses only two Q-states it is straightforward and consequently has much lesser number of optimization parameters, and thus also has a simple reward function. Also, the proposed technique uses novel actions for processing images as compared to other techniques found in literature. The performance of the proposed technique is compared with other recent algorithms like ResNet50, InceptionV3, etc. on popular databases including ImageNet, Cats and Dogs Dataset, and Caltech-101 Dataset. The proposed approach outperforms others techniques on all the datasets used.
CVJun 28, 2020
Fast Training of Deep Networks with One-Class CNNsAbdul Mueed Hafiz, Ghulam Mohiuddin Bhat
One-class CNNs have shown promise in novelty detection. However, very less work has been done on extending them to multiclass classification. The proposed approach is a viable effort in this direction. It uses one-class CNNs i.e., it trains one CNN per class, for multiclass classification. An ensemble of such one-class CNNs is used for multiclass classification. The benefits of the approach are generally better recognition accuracy while taking almost even half or two-thirds of the training time of a conventional multi-class deep network. The proposed approach has been applied successfully to face recognition and object recognition tasks. For face recognition, a 1000 frame RGB video, featuring many faces together, has been used for benchmarking of the proposed approach. Its database is available on request via e-mail. For object recognition, the Caltech-101 Image Database and 17Flowers Dataset have also been used. The experimental results support the claims made.
CVJun 28, 2020
A Survey on Instance Segmentation: State of the artAbdul Mueed Hafiz, Ghulam Mohiuddin Bhat
Object detection or localization is an incremental step in progression from coarse to fine digital image inference. It not only provides the classes of the image objects, but also provides the location of the image objects which have been classified. The location is given in the form of bounding boxes or centroids. Semantic segmentation gives fine inference by predicting labels for every pixel in the input image. Each pixel is labelled according to the object class within which it is enclosed. Furthering this evolution, instance segmentation gives different labels for separate instances of objects belonging to the same class. Hence, instance segmentation may be defined as the technique of simultaneously solving the problem of object detection as well as that of semantic segmentation. In this survey paper on instance segmentation -- its background, issues, techniques, evolution, popular datasets, related work up to the state of the art and future scope have been discussed. The paper provides valuable information for those who want to do research in the field of instance segmentation.
CVJun 28, 2020
Reinforcement Learning Based Handwritten Digit Recognition with Two-State Q-LearningAbdul Mueed Hafiz, Ghulam Mohiuddin Bhat
We present a simple yet efficient Hybrid Classifier based on Deep Learning and Reinforcement Learning. Q-Learning is used with two Q-states and four actions. Conventional techniques use feature maps extracted from Convolutional Neural Networks (CNNs) and include them in the Qstates along with past history. This leads to difficulties with these approaches as the number of states is very large number due to high dimensions of the feature maps. Since our method uses only two Q-states it is simple and has much lesser number of parameters to optimize and also thus has a straightforward reward function. Also, the approach uses unexplored actions for image processing vis-a-vis other contemporary techniques. Three datasets have been used for benchmarking of the approach. These are the MNIST Digit Image Dataset, the USPS Digit Image Dataset and the MATLAB Digit Image Dataset. The performance of the proposed hybrid classifier has been compared with other contemporary techniques like a well-established Reinforcement Learning Technique, AlexNet, CNN-Nearest Neighbor Classifier and CNNSupport Vector Machine Classifier. Our approach outperforms these contemporary hybrid classifiers on all the three datasets used.