Anil Kumar

CV
3 papers
15 citations
Novelty 42%
AI Score 36

3 Papers

CV | Jul 23, 2024 | Code
A Multitask Deep Learning Model for Classification and Regression of Hyperspectral Images: Application to the large-scale dataset

Koushikey Chhapariya, Alexandre Benoit, Krishna Mohan Buddhiraju et al.

Multitask learning is a widely recognized technique in computer vision and deep learning. However, it remains an open research question in remote sensing, particularly for hyperspectral imaging. Moreover, most research in the remote sensing domain focuses on small, single-task annotated datasets, which limits the generalizability and scalability of the developed models to more diverse and complex real-world scenarios. Thus, in this study, we propose a multitask deep learning model designed to perform multiple classification and regression tasks simultaneously on hyperspectral images. We validated our approach on a large hyperspectral dataset called TAIGA, which contains 13 forest variables, including three categorical variables and ten continuous variables with different biophysical parameters. We design a shared encoder and task-specific decoder network to streamline feature learning while allowing each task-specific decoder to focus on the unique aspects of its respective task. Additionally, a dense atrous pyramid pooling layer and an attention network were integrated to extract multi-scale contextual information and enable selective information processing by prioritizing task-specific features. Further, we computed a multitask loss and optimized its parameters for the proposed framework to improve model performance and efficiency across diverse tasks. A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods. We trained our model across 10 seeds/trials to ensure robustness. Our proposed model demonstrates higher mean performance while maintaining lower or equivalent variability. To make the work reproducible, the code will be available at https://github.com/Koushikey4596/Multitask-Deep-Learning-Model-for-Taiga-datatset.
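To illustrate the multitask loss mentioned in the abstract above, here is a minimal sketch in plain Python. It assumes a simple weighted sum of a softmax cross-entropy term per categorical task and a mean squared error term over the continuous variables. The abstract does not specify the paper's actual loss, so the function names (`cross_entropy`, `mse`, `multitask_loss`) and the `cls_weight` and `reg_weight` parameters are hypothetical.

```python
import math


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one sample, computed in a numerically stable way."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


def mse(predictions, targets):
    """Mean squared error over a list of continuous variables."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def multitask_loss(cls_logits, cls_targets, reg_preds, reg_targets,
                   cls_weight=1.0, reg_weight=1.0):
    """Weighted sum of classification and regression losses (an assumed form).

    cls_logits:  one logit list per categorical task
    cls_targets: one class index per categorical task
    reg_preds / reg_targets: one value per continuous variable
    """
    cls_loss = sum(cross_entropy(l, t) for l, t in zip(cls_logits, cls_targets))
    reg_loss = mse(reg_preds, reg_targets)
    return cls_weight * cls_loss + reg_weight * reg_loss


# Example: one two-class task and two continuous variables.
loss = multitask_loss([[0.0, 0.0]], [0], [1.0, 2.0], [1.0, 4.0], reg_weight=0.5)
```

In a shared-encoder design, a single scalar like this would be backpropagated through every task-specific decoder and the common encoder. The weights control how strongly each task group shapes the shared features.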

CV | Nov 21, 2025
Real-Time Cooked Food Image Synthesis and Visual Cooking Progress Monitoring on Edge Devices

Jigyasa Gupta, Soumya Goyal, Anil Kumar et al.

Synthesizing realistic cooked food images from raw inputs on edge devices is a challenging generative task, requiring models to capture complex changes in texture, color and structure during cooking. Existing image-to-image generation methods often produce unrealistic results or are too resource-intensive for edge deployment. We introduce the first oven-based cooking-progression dataset with chef-annotated doneness levels and propose an edge-efficient, recipe- and cooking-state-guided generator that synthesizes realistic food images conditioned on a raw food image. This formulation enables user-preferred visual targets rather than fixed presets. To ensure temporal consistency and culinary plausibility, we introduce a domain-specific Culinary Image Similarity (CIS) metric, which serves both as a training loss and a progress-monitoring signal. Our model outperforms existing baselines with significant reductions in FID scores (30% improvement on our dataset; 60% on public datasets).

IV | May 4, 2019
Deep 3D Convolutional Neural Network for Automated Lung Cancer Diagnosis

Sumita Mishra, Naresh Kumar Chaudhary, Pallavi Asthana et al.

Computer Aided Diagnosis has emerged as an indispensable technique for validating the opinion of radiologists in CT interpretation. This paper presents a deep 3D Convolutional Neural Network (CNN) architecture for an automated CT scan-based lung cancer detection system. It utilizes three-dimensional spatial information to learn highly discriminative 3D features instead of 2D features such as texture or geometric shape, which need to be generated manually. The proposed deep learning method automatically extracts the 3D features on the basis of spatio-temporal statistics. The developed model is end-to-end and is able to predict the malignancy of each voxel for a given input scan. Simulation results demonstrate the effectiveness of the proposed 3D CNN for classification of lung nodules in spite of limited computational capabilities.
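The 3D convolution underlying such a network can be sketched in plain Python as follows. This is only an illustration of the basic operation on a voxel volume, assuming "valid" padding, stride 1, and a single channel. It is not the paper's architecture, and the name `conv3d_valid` is hypothetical.

```python
def conv3d_valid(volume, kernel):
    """Slide a 3D kernel over a 3D volume (nested lists indexed [z][y][x]).

    Uses cross-correlation, as is conventional in deep learning layers,
    with no padding and a stride of 1.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    output = []
    for z in range(depth - kd + 1):
        plane = []
        for y in range(height - kh + 1):
            row = []
            for x in range(width - kw + 1):
                acc = 0.0
                # Accumulate over the kernel's full 3D neighbourhood, so each
                # output voxel depends on adjacent slices as well as on
                # in-plane neighbours.
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                row.append(acc)
            plane.append(row)
        output.append(plane)
    return output


# Example: a 3x3x3 volume of ones filtered with a 2x2x2 kernel of ones.
ones_volume = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
ones_kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
result = conv3d_valid(ones_volume, ones_kernel)
```

Unlike a 2D filter applied slice by slice, the kernel here spans the depth axis too, which is what lets learned features capture the three-dimensional shape of a nodule.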