Nirmalya Roy

CV
h-index9
24papers
247citations
Novelty41%
AI Score49

24 Papers

LGAug 13, 2022
Demo: RhythmEdge: Enabling Contactless Heart Rate Estimation on the Edge

Zahid Hasan, Emon Dey, Sreenivasan Ramasamy Ramamurthy et al.

In this demo paper, we design and prototype RhythmEdge, a low-cost, deep-learning-based contact-less system for regular HR monitoring applications. RhythmEdge benefits over existing approaches by facilitating contact-less nature, real-time/offline operation, inexpensive and available sensing components, and computing devices. Our RhythmEdge system is portable and easily deployable for reliable HR estimation in moderately controlled indoor or outdoor environments. RhythmEdge measures HR via detecting changes in blood volume from facial videos (Remote Photoplethysmography; rPPG) and provides instant assessment using off-the-shelf commercially available resource-constrained edge platforms and video cameras. We demonstrate the scalability, flexibility, and compatibility of the RhythmEdge by deploying it on three resource-constrained platforms of differing architectures (NVIDIA Jetson Nano, Google Coral Development Board, Raspberry Pi) and three heterogeneous cameras of differing sensitivity, resolution, properties (web camera, action camera, and DSLR). RhythmEdge further stores longitudinal cardiovascular information and provides instant notification to the users. We thoroughly test the prototype stability, latency, and feasibility for three edge computing platforms by profiling their runtime, memory, and power usage.

CVApr 18Code
Lorentz Framework for Semantic Segmentation

Zahid Hasan, Masud Ahmed, Nirmalya Roy

Semantic segmentation in hyperbolic space enables compact modeling of hierarchical structure while providing inherent uncertainty quantification. Prior approaches predominantly rely on the Poincaré ball model, which suffers from numerical instability, optimization, and computational challenges. We propose a novel, tractable, architecture-agnostic semantic segmentation framework (pixel-wise and mask classification) in the hyperbolic Lorentz model. We employ text embeddings with semantic and visual cues to guide hierarchical pixel-level representations in Lorentz space. This enables stable and efficient optimization without requiring a Riemannian optimizer, and easily integrates with existing Euclidean architectures. Beyond segmentation, our approach yields free uncertainty estimation, confidence map, boundary delineation, hierarchical and text-based retrieval, and zero-shot performance, reaching generalized flatter minima. We introduce a novel uncertainty and confidence indicator in Lorentz cone embeddings. Further, we provide analytical and empirical insights into Lorentz optimization via gradient analysis. Extensive experiments on ADE20K, COCO-Stuff-164k, Pascal-VOC, and Cityscapes, utilizing state-of-the-art per-pixel classification models (DeepLabV3 and SegFormer) and mask classification models (mask2former and maskformer), validate the effectiveness and generality of our approach. Our results demonstrate the potential of hyperbolic Lorentz embeddings for robust and uncertainty-aware semantic segmentation. Code is available at https://github.com/mxahan/Lorentz_semantic_segmentation.

LGMar 28, 2023
Crime Prediction Using Machine Learning and Deep Learning: A Systematic Review and Future Directions

Varun Mandalapu, Lavanya Elluri, Piyush Vyas et al.

Predicting crime using machine learning and deep learning techniques has gained considerable attention from researchers in recent years, focusing on identifying patterns and trends in crime occurrences. This review paper examines over 150 articles to explore the various machine learning and deep learning algorithms applied to predict crime. The study provides access to the datasets used for crime prediction by researchers and analyzes prominent approaches applied in machine learning and deep learning algorithms to predict crime, offering insights into different trends and factors related to criminal activities. Additionally, the paper highlights potential gaps and future directions that can enhance the accuracy of crime prediction. Finally, the comprehensive overview of research discussed in this paper on crime prediction using machine learning and deep learning approaches serves as a valuable reference for researchers in this field. By gaining a deeper understanding of crime prediction techniques, law enforcement agencies can develop strategies to prevent and respond to criminal activities more effectively.

SPApr 7, 2023
Domain Adaptation for Inertial Measurement Unit-based Human Activity Recognition: A Survey

Avijoy Chakma, Abu Zaher Md Faridee, Indrajeet Ghosh et al.

Machine learning-based wearable human activity recognition (WHAR) models enable the development of various smart and connected community applications such as sleep pattern monitoring, medication reminders, cognitive health assessment, sports analytics, etc. However, the widespread adoption of these WHAR models is impeded by their degraded performance in the presence of data distribution heterogeneities caused by the sensor placement at different body positions, inherent biases and heterogeneities across devices, and personal and environmental diversities. Various traditional machine learning algorithms and transfer learning techniques have been proposed in the literature to address the underpinning challenges of handling such data heterogeneities. Domain adaptation is one such transfer learning techniques that has gained significant popularity in recent literature. In this paper, we survey the recent progress of domain adaptation techniques in the Inertial Measurement Unit (IMU)-based human activity recognition area, discuss potential future directions.

LGOct 22, 2022
Mixed Precision Quantization to Tackle Gradient Leakage Attacks in Federated Learning

Pretom Roy Ovi, Emon Dey, Nirmalya Roy et al.

Federated Learning (FL) enables collaborative model building among a large number of participants without the need for explicit data sharing. But this approach shows vulnerabilities when privacy inference attacks are applied to it. In particular, in the event of a gradient leakage attack, which has a higher success rate in retrieving sensitive data from the model gradients, FL models are at higher risk due to the presence of communication in their inherent architecture. The most alarming thing about this gradient leakage attack is that it can be performed in such a covert way that it does not hamper the training performance while the attackers backtrack from the gradients to get information about the raw data. Two of the most common approaches proposed as solutions to this issue are homomorphic encryption and adding noise with differential privacy parameters. These two approaches suffer from two major drawbacks. They are: the key generation process becomes tedious with the increasing number of clients, and noise-based differential privacy suffers from a significant drop in global model accuracy. As a countermeasure, we propose a mixed-precision quantized FL scheme, and we empirically show that both of the issues addressed above can be resolved. In addition, our approach can ensure more robustness as different layers of the deep model are quantized with different precision and quantization modes. We empirically proved the validity of our method with three benchmark datasets and found a minimal accuracy drop in the global model after applying quantization.

CVApr 14, 2023
NEV-NCD: Negative Learning, Entropy, and Variance regularization based novel action categories discovery

Zahid Hasan, Masud Ahmed, Abu Zaher Md Faridee et al.

Novel Categories Discovery (NCD) facilitates learning from a partially annotated label space and enables deep learning (DL) models to operate in an open-world setting by identifying and differentiating instances of novel classes based on the labeled data notions. One of the primary assumptions of NCD is that the novel label space is perfectly disjoint and can be equipartitioned, but it is rarely realized by most NCD approaches in practice. To better align with this assumption, we propose a novel single-stage joint optimization-based NCD method, Negative learning, Entropy, and Variance regularization NCD (NEV-NCD). We demonstrate the efficacy of NEV-NCD in previously unexplored NCD applications of video action recognition (VAR) with the public UCF101 dataset and a curated in-house partial action-space annotated multi-view video dataset. We perform a thorough ablation study by varying the composition of final joint loss and associated hyper-parameters. During our experiments with UCF101 and multi-view action dataset, NEV-NCD achieves ~ 83% classification accuracy in test instances of labeled data. NEV-NCD achieves ~ 70% clustering accuracy over unlabeled data outperforming both naive baselines (by ~ 40%) and state-of-the-art pseudo-labeling-based approaches (by ~ 3.5%) over both datasets. Further, we propose to incorporate optional view-invariant feature learning with the multiview dataset to identify novel categories from novel viewpoints. Our additional view-invariance constraint improves the discriminative accuracy for both known and unknown categories by ~ 10% for novel viewpoints.

CVJul 7, 2023
Novel Categories Discovery Via Constraints on Empirical Prediction Statistics

Zahid Hasan, Abu Zaher Md Faridee, Masud Ahmed et al.

Novel Categories Discovery (NCD) aims to cluster novel data based on the class semantics of known classes using the open-world partial class space annotated dataset. As an alternative to the traditional pseudo-labeling-based approaches, we leverage the connection between the data sampling and the provided multinoulli (categorical) distribution of novel classes. We introduce constraints on individual and collective statistics of predicted novel class probabilities to implicitly achieve semantic-based clustering. More specifically, we align the class neuron activation distributions under Monte-Carlo sampling of novel classes in large batches by matching their empirical first-order (mean) and second-order (covariance) statistics with the multinoulli distribution of the labels while applying instance information constraints and prediction consistency under label-preserving augmentations. We then explore a directional statistics-based probability formation that learns the mixture of Von Mises-Fisher distribution of class labels in a unit hypersphere. We demonstrate the discriminative ability of our approach to realize semantic clustering of novel samples in image, video, and time-series modalities. We perform extensive ablation studies regarding data, networks, and framework components to provide better insights. Our approach maintains 94%, 93%, 85%, and 93% (approx.) classification accuracy in labeled data while achieving 90%, 84%, 72%, and 75% (approx.) clustering accuracy for novel categories in Cifar10, UCF101, MPSC-ARL, and SHAR datasets that match state-of-the-art approaches without any external clustering.

LGApr 10, 2023
Recent Advancements in Machine Learning For Cybercrime Prediction

Lavanya Elluri, Varun Mandalapu, Piyush Vyas et al.

Cybercrime is a growing threat to organizations and individuals worldwide, with criminals using sophisticated techniques to breach security systems and steal sensitive data. This paper aims to comprehensively survey the latest advancements in cybercrime prediction, highlighting the relevant research. For this purpose, we reviewed more than 150 research articles and discussed 50 most recent and appropriate ones. We start the review with some standard methods cybercriminals use and then focus on the latest machine and deep learning techniques, which detect anomalous behavior and identify potential threats. We also discuss transfer learning, which allows models trained on one dataset to be adapted for use on another dataset. We then focus on active and reinforcement learning as part of early-stage algorithmic research in cybercrime prediction. Finally, we discuss critical innovations, research gaps, and future research opportunities in Cybercrime prediction. This paper presents a holistic view of cutting-edge developments and publicly available datasets.

ROAug 12, 2023
CoverNav: Cover Following Navigation Planning in Unstructured Outdoor Environment with Deep Reinforcement Learning

Jumman Hossain, Abu-Zaher Faridee, Nirmalya Roy et al.

Autonomous navigation in offroad environments has been extensively studied in the robotics field. However, navigation in covert situations where an autonomous vehicle needs to remain hidden from outside observers remains an underexplored area. In this paper, we propose a novel Deep Reinforcement Learning (DRL) based algorithm, called CoverNav, for identifying covert and navigable trajectories with minimal cost in offroad terrains and jungle environments in the presence of observers. CoverNav focuses on unmanned ground vehicles seeking shelters and taking covers while safely navigating to a predefined destination. Our proposed DRL method computes a local cost map that helps distinguish which path will grant the maximal covertness while maintaining a low cost trajectory using an elevation map generated from 3D point cloud data, the robot's pose, and directed goal information. CoverNav helps robot agents to learn the low elevation terrain using a reward function while penalizing it proportionately when it experiences high elevation. If an observer is spotted, CoverNav enables the robot to select natural obstacles (e.g., rocks, houses, disabled vehicles, trees, etc.) and use them as shelters to hide behind. We evaluate CoverNav using the Unity simulation environment and show that it guarantees dynamically feasible velocities in the terrain when fed with an elevation map generated by another DRL based navigation algorithm. Additionally, we evaluate CoverNav's effectiveness in achieving a maximum goal distance of 12 meters and its success rate in different elevation scenarios with and without cover objects. We observe competitive performance comparable to state of the art (SOTA) methods without compromising accuracy.

DCApr 17
DataCenterGym: A Physics-Grounded Simulator for Multi-Objective Data Center Scheduling

Nilavra Pathak, Samadrita Biswas, Nirmalya Roy

Modern datacenters schedule heterogeneous workloads across geo-distributed sites with diverse compute capacities, electricity prices, and thermal conditions. Compute utilization, heat generation, cooling demand, and energy consumption are tightly coupled, yet most existing schedulers abstract these effects and treat them independently. We present \textit{DataCenterGym}, a physics-grounded simulation environment for job scheduling in geo-distributed data centers, designed as a reusable testbed for future research. The simulator integrates compute queueing, building thermal dynamics, localized HVAC behavior, and temperature-dependent service degradation within a Gymnasium-compatible interface. We also develop a Hierarchical Model Predictive Control (H-MPC) scheduling algorithm that performs distributed job placement while explicitly accounting for thermal and power dynamics. Through experiments on nominal operation and workload sensitivity, we demonstrate how H-MPC improves scheduling performance relative to baseline schedulers.

ROMar 15
SERN: Bandwidth-Adaptive Cross-Reality Synchronization for Simulation-Enhanced Robot Navigation

Jumman Hossain, Emon Dey, Snehalraj Chugh et al.

Cross reality integration of simulation and physical robots is a promising approach for multi-robot operations in contested environments, where communication may be intermittent, interference may be present, and observability may be degraded. We present SERN (Simulation-Enhanced Realistic Navigation), a framework that tightly couples a high-fidelity virtual twin with physical robots to support real-time collaborative decision making. SERN makes three main contributions. First, it builds a virtual twin from geospatial and sensor data and continuously corrects it using live robot telemetry. Second, it introduces a physics-aware synchronization pipeline that combines predictive modeling with adaptive PD control. Third, it provides a bandwidth-adaptive ROS bridge that prioritizes critical topics when communication links are constrained. We also introduce a multi-metric cost function that balances latency, reliability, computation, and bandwidth. Theoretically, we show that when the adaptive controller keeps the physical and virtual input mismatch small, synchronization error remains bounded under moderate packet loss and latency. Empirically, SERN reduces end-to-end message latency by 15% to 25% and processing load by about 15% compared with a standard ROS setup, while maintaining tight real-virtual alignment with less than 5 cm positional error and less than 2 degrees rotational error. In a navigation task, SERN achieves a 95% success rate, compared with 85% for a real-only setup and 70% for a simulation-only setup, while also requiring fewer interventions and less time to reach the goal. These results show that a simulation-enhanced cross-reality stack can improve situational awareness and multi-agent coordination in contested environments by enabling look-ahead planning in the virtual twin while using real sensor feedback to correct discrepancies.

CVApr 13, 2025Code
A Survey on Efficient Vision-Language Models

Gaurav Shinde, Anuradha Ravi, Emon Dey et al.

Vision-language models (VLMs) integrate visual and textual information, enabling a wide range of applications such as image captioning and visual question answering, making them crucial for modern AI systems. However, their high computational demands pose challenges for real-time applications. This has led to a growing focus on developing efficient vision language models. In this survey, we review key techniques for optimizing VLMs on edge and resource-constrained devices. We also explore compact VLM architectures, frameworks and provide detailed insights into the performance-memory trade-offs of efficient VLMs. Furthermore, we establish a GitHub repository at https://github.com/MPSCUMBC/Efficient-Vision-Language-Models-A-Survey to compile all surveyed papers, which we will actively update. Our objective is to foster deeper research in this area.

CVMar 19, 2025Code
CAM-Seg: A Continuous-valued Embedding Approach for Semantic Image Generation

Masud Ahmed, Zahid Hasan, Syed Arefinul Haque et al.

Traditional transformer-based semantic segmentation relies on quantized embeddings. However, our analysis reveals that autoencoder accuracy on segmentation mask using quantized embeddings (e.g. VQ-VAE) is 8% lower than continuous-valued embeddings (e.g. KL-VAE). Motivated by this, we propose a continuous-valued embedding framework for semantic segmentation. By reformulating semantic mask generation as a continuous image-to-embedding diffusion process, our approach eliminates the need for discrete latent representations while preserving fine-grained spatial and semantic details. Our key contribution includes a diffusion-guided autoregressive transformer that learns a continuous semantic embedding space by modeling long-range dependencies in image features. Our framework contains a unified architecture combining a VAE encoder for continuous feature extraction, a diffusion-guided transformer for conditioned embedding generation, and a VAE decoder for semantic mask reconstruction. Our setting facilitates zero-shot domain adaptation capabilities enabled by the continuity of the embedding space. Experiments across diverse datasets (e.g., Cityscapes and domain-shifted variants) demonstrate state-of-the-art robustness to distribution shifts, including adverse weather (e.g., fog, snow) and viewpoint variations. Our model also exhibits strong noise resilience, achieving robust performance ($\approx$ 95% AP compared to baseline) under gaussian noise, moderate motion blur, and moderate brightness/contrast variations, while experiencing only a moderate impact ($\approx$ 90% AP compared to baseline) from 50% salt and pepper noise, saturation and hue shifts. Code available: https://github.com/mahmed10/CAMSS.git

ROMar 11
COHORT: Hybrid RL for Collaborative Large DNN Inference on Multi-Robot Systems Under Real-Time Constraints

Mohammad Saeid Anwar, Anuradha Ravi, Indrajeet Ghosh et al.

Large deep neural networks (DNNs), especially transformer-based and multimodal architectures, are computationally demanding and challenging to deploy on resource-constrained edge platforms like field robots. These challenges intensify in mission-critical scenarios (e.g., disaster response), where robots must collaborate under tight constraints on bandwidth, latency, and battery life, often without infrastructure or server support. To address these limitations, we present COHORT, a collaborative DNN inference and task-execution framework for multi-robot systems built on the Robotic Operating System (ROS). COHORT employs a hybrid offline-online reinforcement learning (RL) strategy to dynamically schedule and distribute DNN module execution across robots. Our key contributions are threefold: (a) Offline RL policy learning combined with Advantage-Weighted Regression (AWR), trained on auction-based task allocation data from heterogeneous DNN workloads across distributed robots, (b) Online policy adaptation via Multi-Agent PPO (MAPPO), initialized from the offline policy and fine-tuned in real time, and (c) comprehensive evaluation of COHORT on vision-language model (VLM) inference tasks such as CLIP and SAM, analyzing scalability with increasing robot/workload and robustness under . We benchmark COHORT against genetic algorithms and multiple RL baselines. Experimental results demonstrate that COHORT reduces battery consumption by 15.4% and increases GPU utilization by 51.67%, while satisfying frame-rate and deadline constraints 2.55 times of the time.

ROFeb 6, 2024
TopoNav: Topological Navigation for Efficient Exploration in Sparse Reward Environments

Jumman Hossain, Abu-Zaher Faridee, Nirmalya Roy et al.

Autonomous robots exploring unknown environments face a significant challenge: navigating effectively without prior maps and with limited external feedback. This challenge intensifies in sparse reward environments, where traditional exploration techniques often fail. In this paper, we present TopoNav, a novel topological navigation framework that integrates active mapping, hierarchical reinforcement learning, and intrinsic motivation to enable efficient goal-oriented exploration and navigation in sparse-reward settings. TopoNav dynamically constructs a topological map of the environment, capturing key locations and pathways. A two-level hierarchical policy architecture, comprising a high-level graph traversal policy and low-level motion control policies, enables effective navigation and obstacle avoidance while maintaining focus on the overall goal. Additionally, TopoNav incorporates intrinsic motivation to guide exploration toward relevant regions and frontier nodes in the topological map, addressing the challenges of sparse extrinsic rewards. We evaluate TopoNav both in the simulated and real-world off-road environments using a Clearpath Jackal robot, across three challenging navigation scenarios: goal-reaching, feature-based navigation, and navigation in complex terrains. We observe an increase in exploration coverage by 7- 20%, in success rates by 9-19%, and reductions in navigation times by 15-36% across various scenarios, compared to state-of-the-art methods

ROMar 29, 2024
EnCoMP: Enhanced Covert Maneuver Planning with Adaptive Threat-Aware Visibility Estimation using Offline Reinforcement Learning

Jumman Hossain, Abu-Zaher Faridee, Nirmalya Roy et al.

Autonomous robots operating in complex environments face the critical challenge of identifying and utilizing environmental cover for covert navigation to minimize exposure to potential threats. We propose EnCoMP, an enhanced navigation framework that integrates offline reinforcement learning and our novel Adaptive Threat-Aware Visibility Estimation (ATAVE) algorithm to enable robots to navigate covertly and efficiently in diverse outdoor settings. ATAVE is a dynamic probabilistic threat modeling technique that we designed to continuously assess and mitigate potential threats in real-time, enhancing the robot's ability to navigate covertly by adapting to evolving environmental and threat conditions. Moreover, our approach generates high-fidelity multi-map representations, including cover maps, potential threat maps, height maps, and goal maps from LiDAR point clouds, providing a comprehensive understanding of the environment. These multi-maps offer detailed environmental insights, helping in strategic navigation decisions. The goal map encodes the relative distance and direction to the target location, guiding the robot's navigation. We train a Conservative Q-Learning (CQL) model on a large-scale dataset collected from real-world environments, learning a robust policy that maximizes cover utilization, minimizes threat exposure, and maintains efficient navigation. We demonstrate our method's capabilities on a physical Jackal robot, showing extensive experiments across diverse terrains. These experiments demonstrate EnCoMP's superior performance compared to state-of-the-art methods, achieving a 95% success rate, 85% cover utilization, and reducing threat exposure to 10.5%, while significantly outperforming baselines in navigation efficiency and robustness.

RODec 13, 2023
Enhancing Robotic Navigation: An Evaluation of Single and Multi-Objective Reinforcement Learning Strategies

Vicki Young, Jumman Hossain, Nirmalya Roy

This study presents a comparative analysis between single-objective and multi-objective reinforcement learning methods for training a robot to navigate effectively to an end goal while efficiently avoiding obstacles. Traditional reinforcement learning techniques, namely Deep Q-Network (DQN), Deep Deterministic Policy Gradient (DDPG), and Twin Delayed DDPG (TD3), have been evaluated using the Gazebo simulation framework in a variety of environments with parameters such as random goal and robot starting locations. These methods provide a numerical reward to the robot, offering an indication of action quality in relation to the goal. However, their limitations become apparent in complex settings where multiple, potentially conflicting, objectives are present. To address these limitations, we propose an approach employing Multi-Objective Reinforcement Learning (MORL). By modifying the reward function to return a vector of rewards, each pertaining to a distinct objective, the robot learns a policy that effectively balances the different goals, aiming to achieve a Pareto optimal solution. This comparative study highlights the potential for MORL in complex, dynamic robotic navigation tasks, setting the stage for future investigations into more adaptable and robust robotic behaviors.

CVOct 23, 2024
Unsupervised Domain Adaptation for Action Recognition via Self-Ensembling and Conditional Embedding Alignment

Indrajeet Ghosh, Garvit Chugh, Abu Zaher Md Faridee et al.

Recent advancements in deep learning-based wearable human action recognition (wHAR) have improved the capture and classification of complex motions, but adoption remains limited due to the lack of expert annotations and domain discrepancies from user variations. Limited annotations hinder the model's ability to generalize to out-of-distribution samples. While data augmentation can improve generalizability, unsupervised augmentation techniques must be applied carefully to avoid introducing noise. Unsupervised domain adaptation (UDA) addresses domain discrepancies by aligning conditional distributions with labeled target samples, but vanilla pseudo-labeling can lead to error propagation. To address these challenges, we propose $μ$DAR, a novel joint optimization architecture comprised of three functions: (i) consistency regularizer between augmented samples to improve model classification generalizability, (ii) temporal ensemble for robust pseudo-label generation and (iii) conditional distribution alignment to improve domain generalizability. The temporal ensemble works by aggregating predictions from past epochs to smooth out noisy pseudo-label predictions, which are then used in the conditional distribution alignment module to minimize kernel-based class-wise conditional maximum mean discrepancy ($k$CMMD) between the source and target feature space to learn a domain invariant embedding. The consistency-regularized augmentations ensure that multiple augmentations of the same sample share the same labels; this results in (a) strong generalization with limited source domain samples and (b) consistent pseudo-label generation in target samples. The novel integration of these three modules in $μ$DAR results in a range of $\approx$ 4-12% average macro-F1 score improvement over six state-of-the-art UDA methods in four benchmark wHAR datasets

ROOct 22, 2024
QuasiNav: Asymmetric Cost-Aware Navigation Planning with Constrained Quasimetric Reinforcement Learning

Jumman Hossain, Abu-Zaher Faridee, Derrik Asher et al.

Autonomous navigation in unstructured outdoor environments is inherently challenging due to the presence of asymmetric traversal costs, such as varying energy expenditures for uphill versus downhill movement. Traditional reinforcement learning methods often assume symmetric costs, which can lead to suboptimal navigation paths and increased safety risks in real-world scenarios. In this paper, we introduce QuasiNav, a novel reinforcement learning framework that integrates quasimetric embeddings to explicitly model asymmetric costs and guide efficient, safe navigation. QuasiNav formulates the navigation problem as a constrained Markov decision process (CMDP) and employs quasimetric embeddings to capture directionally dependent costs, allowing for a more accurate representation of the terrain. This approach is combined with adaptive constraint tightening within a constrained policy optimization framework to dynamically enforce safety constraints during learning. We validate QuasiNav across three challenging navigation scenarios-undulating terrains, asymmetric hill traversal, and directionally dependent terrain traversal-demonstrating its effectiveness in both simulated and real-world environments. Experimental results show that QuasiNav significantly outperforms conventional methods, achieving higher success rates, improved energy efficiency, and better adherence to safety constraints.

DCMay 5, 2023
HeteroEdge: Addressing Asymmetry in Heterogeneous Collaborative Autonomous Systems

Mohammad Saeid Anwar, Emon Dey, Maloy Kumar Devnath et al.

Gathering knowledge about surroundings and generating situational awareness for IoT devices is of utmost importance for systems developed for smart urban and uncontested environments. For example, a large-area surveillance system is typically equipped with multi-modal sensors such as cameras and LIDARs and is required to execute deep learning algorithms for action, face, behavior, and object recognition. However, these systems face power and memory constraints due to their ubiquitous nature, making it crucial to optimize data processing, deep learning algorithm input, and model inference communication. In this paper, we propose a self-adaptive optimization framework for a testbed comprising two Unmanned Ground Vehicles (UGVs) and two NVIDIA Jetson devices. This framework efficiently manages multiple tasks (storage, processing, computation, transmission, inference) on heterogeneous nodes concurrently. It involves compressing and masking input image frames, identifying similar frames, and profiling devices to obtain boundary conditions for optimization.. Finally, we propose and optimize a novel parameter split-ratio, which indicates the proportion of the data required to be offloaded to another device while considering the networking bandwidth, busy factor, memory (CPU, GPU, RAM), and power constraints of the devices in the testbed. Our evaluations captured while executing multiple tasks (e.g., PoseNet, SegNet, ImageNet, DetectNet, DepthNet) simultaneously, reveal that executing 70% (split-ratio=70%) of the data on the auxiliary node minimizes the offloading latency by approx. 33% (18.7 ms/image to 12.5 ms/image) and the total operation time by approx. 47% (69.32s to 36.43s) compared to the baseline configuration (executing on the primary node).

CVMay 3, 2023
A Systematic Study on Object Recognition Using Millimeter-wave Radar

Maloy Kumar Devnath, Avijoy Chakma, Mohammad Saeid Anwar et al.

Due to its light and weather-independent sensing, millimeter-wave (MMW) radar is essential in smart environments. Intelligent vehicle systems and industry-grade MMW radars have integrated such capabilities. Industry-grade MMW radars are expensive and hard to get for community-purpose smart environment applications. However, commercially available MMW radars have hidden underpinning challenges that need to be investigated for tasks like recognizing objects and activities, real-time person tracking, object localization, etc. Image and video data are straightforward to gather, understand, and annotate for such jobs. Image and video data are light and weather-dependent, susceptible to the occlusion effect, and present privacy problems. To eliminate dependence and ensure privacy, commercial MMW radars should be tested. MMW radar's practicality and performance in varied operating settings must be addressed before promoting it. To address the problems, we collected a dataset using Texas Instruments' Automotive mmWave Radar (AWR2944) and reported the best experimental settings for object recognition performance using different deep learning algorithms. Our extensive data gathering technique allows us to systematically explore and identify object identification task problems under cross-ambience conditions. We investigated several solutions and published detailed experimental data.

HCMar 17, 2020
AutoCogniSys: IoT Assisted Context-Aware Automatic Cognitive Health Assessment

Mohammad Arif Ul Alam, Nirmalya Roy, Sarah Holmes et al.

Cognitive impairment has become epidemic in older adult population. The recent advent of tiny wearable and ambient devices, a.k.a Internet of Things (IoT) provides ample platforms for continuous functional and cognitive health assessment of older adults. In this paper, we design, implement and evaluate AutoCogniSys, a context-aware automated cognitive health assessment system, combining the sensing powers of wearable physiological (Electrodermal Activity, Photoplethysmography) and physical (Accelerometer, Object) sensors in conjunction with ambient sensors. We design appropriate signal processing and machine learning techniques, and develop an automatic cognitive health assessment system in a natural older adults living environment. We validate our approaches using two datasets: (i) a naturalistic sensor data streams related to Activities of Daily Living and mental arousal of 22 older adults recruited in a retirement community center, individually living in their own apartments using a customized inexpensive IoT system (IRB #HP-00064387) and (ii) a publicly available dataset for emotion detection. The performance of AutoCogniSys attests max. 93\% of accuracy in assessing cognitive health of older adults.

CLFeb 10, 2020
Localized Flood DetectionWith Minimal Labeled Social Media Data Using Transfer Learning

Neha Singh, Nirmalya Roy, Aryya Gangopadhyay

Social media generates an enormous amount of data on a daily basis but it is very challenging to effectively utilize the data without annotating or labeling it according to the target application. We investigate the problem of localized flood detection using the social sensing model (Twitter) in order to provide an efficient, reliable and accurate flood text classification model with minimal labeled data. This study is important since it can immensely help in providing the flood-related updates and notifications to the city officials for emergency decision making, rescue operations, and early warnings, etc. We propose to perform the text classification using the inductive transfer learning method i.e pre-trained language model ULMFiT and fine-tune it in order to effectively classify the flood-related feeds in any new location. Finally, we show that using very little new labeled data in the target domain we can successfully build an efficient and high performing model for flood detection and analysis with human-generated facts and observations from Twitter.

LGJan 9, 2019
Estimating Buildings' Parameters over Time Including Prior Knowledge

Nilavra Pathak, James Foulds, Nirmalya Roy et al.

Modeling buildings' heat dynamics is a complex process which depends on various factors including weather, building thermal capacity, insulation preservation, and residents' behavior. Gray-box models offer a causal inference of those dynamics expressed in few parameters specific to built environments. These parameters can provide compelling insights into the characteristics of building artifacts and have various applications such as forecasting HVAC usage, indoor temperature control monitoring of built environments, etc. In this paper, we present a systematic study of modeling buildings' thermal characteristics and thus derive the parameters of built conditions with a Bayesian approach. We build a Bayesian state-space model that can adapt and incorporate buildings' thermal equations and propose a generalized solution that can easily adapt prior knowledge regarding the parameters. We show that a faster approximate approach using variational inference for parameter estimation can provide similar parameters as that of a more time-consuming Markov Chain Monte Carlo (MCMC) approach. We perform extensive evaluations on two datasets to understand the generative process and show that the Bayesian approach is more interpretable. We further study the effects of prior selection for the model parameters and transfer learning, where we learn parameters from one season and use them to fit the model in the other. We perform extensive evaluations on controlled and real data traces to enumerate buildings' parameter within a 95% credible interval.