HC Mar 24
Human vs. NAO: A Computational-Behavioral Framework for Quantifying Social Orienting in Autism and Typical Development
Vartika Narayani Srinet, Anirudha Bhattacharjee, Braj Bhushan et al.
Responding to one's name is among the earliest-emerging social orienting behaviors and one of the most prominent indicators used in the detection of Autism Spectrum Disorder (ASD). Typically developing children exhibit near-reflexive orienting to their name, whereas children with ASD often demonstrate reduced frequency, increased latency, or atypical patterns of response. In this study, we quantify differential responsiveness to name-calling stimuli delivered by both human agents and NAO, a humanoid robot widely employed in socially assistive interventions for autism. The analysis focuses on multiple behavioral parameters, including eye contact, response latency, head and facial orientation shifts, and duration of sustained interest. Video-based computational methods, incorporating face detection, eye region tracking, and spatio-temporal facial analysis, were employed to obtain fine-grained measures of children's responses. By comparing neurotypical and neuroatypical groups under controlled human-robot conditions, this work aims to understand how the source and modality of social cues affect attentional dynamics in name-calling contexts. The findings advance both the theoretical understanding of social orienting deficits in autism and the applied development of robot-assisted assessment tools.
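As an illustration of how a parameter such as response latency might be derived from per-frame video output, here is a minimal sketch. It assumes a hypothetical per-frame boolean signal `facing` (whether the child's face is oriented toward the caller), a known stimulus frame, and a sustained-run criterion `min_run`; none of these names or thresholds come from the paper.

```python
def response_latency(facing, stimulus_frame, fps, min_run=3):
    """Return seconds from the name call to the first sustained orienting response.

    facing: list of booleans, one per video frame (hypothetical detector output).
    stimulus_frame: index of the frame at which the name was called.
    min_run: consecutive facing frames required to count as a response (assumed).
    Returns None if no sustained response occurs after the stimulus.
    """
    run = 0
    for i in range(stimulus_frame, len(facing)):
        run = run + 1 if facing[i] else 0
        if run == min_run:
            onset = i - min_run + 1  # first frame of the sustained run
            return (onset - stimulus_frame) / fps
    return None


# Example: a brief glance at frame 10 is ignored; the sustained response starts at frame 12.
facing = [False] * 10 + [True, False] + [True] * 5
latency = response_latency(facing, stimulus_frame=5, fps=10)
```

Requiring a run of consecutive frames is one simple way to keep single-frame detection noise from being counted as a response.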
CV Dec 27, 2024
A Hybrid Technique for Plant Disease Identification and Localisation in Real-time
Mahendra Kumar Gohil, Anirudha Bhattacharjee, Rwik Rana et al.
Over the past decade, several image-processing methods and algorithms have been proposed for identifying plant diseases based on visual data. Deep Neural Networks (DNNs) have recently become popular for this task. Both traditional image processing and DNN-based methods encounter significant performance issues in real-time detection owing to computational limitations and a broad spectrum of plant disease features. This article proposes a novel technique for identifying and localising plant disease that combines Quad-Tree decomposition of an image with simultaneous feature learning. The proposed algorithm significantly improves accuracy and converges faster on high-resolution images with relatively low computational load. Hence, the algorithm is well suited to deployment on a standalone processor in a remotely operated image acquisition and disease detection system, ideally mounted on drones and robots working in large agricultural fields. The technique proposed in this article is hybrid, as it exploits the advantages of traditional image processing methods and DNN-based models at different scales, resulting in faster inference. The F1 score is approximately 0.80 for four disease classes corresponding to potato and tomato crops.
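To make the Quad-Tree idea concrete, the following is a minimal sketch of a generic quad-tree decomposition on a grayscale image. The abstract does not state the splitting criterion, so the intensity-range test, the `threshold` and `min_size` parameters, and the function name are assumptions for illustration, not the authors' algorithm. The image is assumed to be square with a power-of-two side.

```python
def quadtree_leaves(img, x=0, y=0, size=None, threshold=10, min_size=1):
    """Recursively split a square region until it is roughly homogeneous.

    img: list of rows of grayscale values (square, power-of-two side assumed).
    A region is split when (max - min) intensity exceeds `threshold` (assumed criterion).
    Returns a list of leaf blocks as (x, y, size) tuples.
    """
    if size is None:
        size = len(img)
    block = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size <= min_size or max(block) - min(block) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_leaves(img, x + dx, y + dy, half, threshold, min_size)
    return leaves


# Example: a uniform 4x4 image with a single bright pixel in the top-left corner.
image = [[0] * 4 for _ in range(4)]
image[0][0] = 255
leaves = quadtree_leaves(image)
```

Only the quadrant containing the bright pixel is refined to single pixels, while uniform regions stay as large blocks. This is the property that lets a detector spend computation on small, detailed regions and skip homogeneous background.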
CV Dec 13, 2025
A Hybrid Deep Learning Framework for Emotion Recognition in Children with Autism During NAO Robot-Mediated Interaction
Indranil Bhattacharjee, Vartika Narayani Srinet, Anirudha Bhattacharjee et al.
Understanding emotional responses in children with Autism Spectrum Disorder (ASD) during social interaction remains a critical challenge in both developmental psychology and human-robot interaction. This study presents a novel deep learning pipeline for emotion recognition in autistic children in response to a name-calling event by a humanoid robot (NAO), under controlled experimental settings. The dataset comprises around 50,000 facial frames extracted from video recordings of 15 children with ASD. A hybrid model combines a fine-tuned ResNet-50-based Convolutional Neural Network (CNN) and a three-layer Graph Convolutional Network (GCN), trained on both visual and geometric features extracted from MediaPipe FaceMesh landmarks. Emotions were probabilistically labeled using a weighted ensemble of two models, DeepFace and FER, each contributing to soft-label generation across seven emotion classes. Final classification leveraged a fused embedding optimized via Kullback-Leibler divergence. The proposed method demonstrates robust performance in modeling subtle affective responses and offers significant promise for affective profiling of children with ASD in clinical and therapeutic human-robot interaction contexts, as the pipeline effectively captures micro emotional cues in neurodivergent children, addressing a major gap in autism-specific HRI research. This work represents the first such large-scale, real-world dataset and pipeline from India on autism-focused emotion analysis using social robotics, contributing an essential foundation for future personalized assistive technologies.
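The soft-labeling and Kullback-Leibler steps can be sketched as follows. This is a minimal illustration in plain Python: the ensemble weight `w_a`, the example probability vectors, and the function names are hypothetical, and the code does not call the DeepFace or FER libraries. It only shows how two seven-class probability vectors could be mixed into a soft label and compared with a model prediction via KL divergence.

```python
import math


def ensemble_soft_label(p_a, p_b, w_a=0.5):
    """Weighted mix of two probability vectors, renormalised to sum to 1."""
    mixed = [w_a * a + (1.0 - w_a) * b for a, b in zip(p_a, p_b)]
    total = sum(mixed)
    return [m / total for m in mixed]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for discrete distributions; eps avoids log(0)."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


# Hypothetical outputs of two labelers over seven emotion classes.
p_model_a = [0.60, 0.10, 0.05, 0.05, 0.10, 0.05, 0.05]
p_model_b = [0.40, 0.20, 0.10, 0.05, 0.15, 0.05, 0.05]
soft_label = ensemble_soft_label(p_model_a, p_model_b, w_a=0.6)  # weight is assumed

# Loss of a uniform prediction against the soft label.
uniform = [1.0 / 7] * 7
loss = kl_divergence(soft_label, uniform)
```

Training against soft labels with a KL objective lets the classifier learn from the labelers' uncertainty rather than from a single hard class per frame.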
CV Dec 26, 2024
A Lightweight Transformer with Phase-Only Cross-Attention for Illumination-Invariant Biometric Authentication
Arun K. Sharma, Shubhobrata Bhattacharya, Motahar Reza et al.
Traditional biometric systems have encountered significant setbacks due to various unavoidable factors, for example, the wearing of face masks in face recognition-based biometrics and hygiene concerns in fingerprint-based biometrics. This paper proposes a novel lightweight vision transformer with phase-only cross-attention (POC-ViT) using the dual biometric traits of the forehead and periocular portions of the face. It is capable of performing well even with face masks and without any physical touch, offering a promising alternative to traditional methods. The POC-ViT framework is designed to handle two biometric traits and to capture their inter-dependencies in terms of relative structural patterns. Each channel consists of a cross-attention module using phase-only correlation (POC) that captures both their individual and correlated structural patterns. Computing cross-attention with POC extracts the phase correlation in the spatial features. Therefore, it is robust against variations in resolution and intensity, as well as illumination changes in the input images. The lightweight model is suitable for edge device deployment. The performance of the proposed framework was demonstrated using the Forehead Subcutaneous Vein Pattern and Periocular Biometric Pattern (FSVP-PBP) database, which contains 350 subjects. The POC-ViT framework outperformed state-of-the-art methods with a classification accuracy of $98.8\%$ using the dual biometric traits.
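To illustrate why phase-only correlation is insensitive to intensity scaling, here is a minimal sketch of classical POC between two small images, written with a naive DFT so that it needs only the standard library. It is not the paper's cross-attention module; the function names, the random test image, and the gain factor are assumptions for illustration.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive 1D discrete Fourier transform (O(n^2)), adequate for tiny inputs."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def dft2(img, inverse=False):
    """2D DFT computed as 1D transforms over rows, then over columns."""
    rows = [dft(r, inverse) for r in img]
    cols = [dft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def phase_only_correlation(f, g, eps=1e-12):
    """Inverse DFT of the unit-magnitude cross spectrum F * conj(G)."""
    F, G = dft2(f), dft2(g)
    cross = [[a * b.conjugate() for a, b in zip(ra, rb)] for ra, rb in zip(F, G)]
    normed = [[c / (abs(c) + eps) for c in row] for row in cross]
    return [[v.real for v in row] for row in dft2(normed, inverse=True)]


# Reference image, and a copy circularly shifted by (2, 3) and brightened by a gain of 3.
rng = random.Random(0)
N, dy, dx = 8, 2, 3
ref = [[rng.random() for _ in range(N)] for _ in range(N)]
moved = [[3.0 * ref[(y - dy) % N][(x - dx) % N] for x in range(N)] for y in range(N)]

poc = phase_only_correlation(moved, ref)
peak_value, peak_pos = max((poc[y][x], (y, x)) for y in range(N) for x in range(N))
```

Because the cross spectrum is normalised to unit magnitude, the multiplicative gain cancels and the correlation surface keeps a single sharp peak at the displacement between the two images.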
LG Dec 3, 2024
AI-driven Inverse Design of Band-Tunable Mechanical Metastructures for Tailored Vibration Mitigation
Tanuj Gupta, Arun Kumar Sharma, Ankur Dwivedi et al.
On-demand vibration mitigation in a mechanical system requires the suitable design of multiscale metastructures involving complex unit cells. In this study, we explore the world of patterns, examine their structural details, and extract some interesting motifs from the perspective of mechanical metastructures. Nine interlaced metastructures are fabricated using additive manufacturing, and their vibration characteristics are studied experimentally and numerically. Further, band-gap modulation with metallic inserts in the honeycomb interlaced metastructures is also studied. AI-driven inverse design of such complex metastructures with a desired vibration mitigation profile can pave the way for addressing engineering challenges in high-precision manufacturing. Current inverse design methodologies are limited to designing simple periodic structures based on limited variants of unit cells. Therefore, a novel forward analysis model with multi-head FEM-inspired spatial attention (FSA) is proposed to learn the complex geometry of the metastructures and predict the corresponding transmissibility. Subsequently, a multiscale Gaussian self-attention (MGSA) based inverse design model, with a Gaussian function for 1D spectrum position encoding, is developed to produce a suitable metastructure for the desired vibration transmittance. The proposed AI framework demonstrated outstanding performance with respect to the expected locally resonant bandgaps in a targeted frequency range.