Bharat Bhargava

CV
h-index: 3
12 papers
274 citations
Novelty: 54%
AI Score: 48

12 Papers

CV Jun 14, 2023
EMERSK -- Explainable Multimodal Emotion Recognition with Situational Knowledge

Mijanur Palash, Bharat Bhargava

Automatic emotion recognition has recently gained significant attention due to the growing popularity of deep learning algorithms. One of the primary challenges in emotion recognition is effectively utilizing the various cues (modalities) available in the data. Another challenge is providing a proper explanation of the outcome of the learning. To address these challenges, we present Explainable Multimodal Emotion Recognition with Situational Knowledge (EMERSK), a generalized and modular system for human emotion recognition and explanation using visual information. Our system can handle multiple modalities, including facial expressions, posture, and gait, in a flexible and modular manner. The network consists of different modules that can be added or removed depending on the available data. We utilize a two-stream network architecture with convolutional neural networks (CNNs) and encoder-decoder style attention mechanisms to extract deep features from face images. Similarly, CNNs and recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) are employed to extract features from posture and gait data. We also incorporate deep features from the background as contextual information for the learning process. The deep features from each module are fused using an early fusion network. Furthermore, we leverage situational knowledge derived from the location type and adjective-noun pair (ANP) extracted from the scene, as well as the spatio-temporal average distribution of emotions, to generate explanations. Ablation studies demonstrate that each sub-network can independently perform emotion recognition, and combining them in a multimodal approach significantly improves overall recognition performance. Extensive experiments conducted on various benchmark datasets, including GroupWalk, validate the superior performance of our approach compared to other state-of-the-art methods.

35.5 CR Mar 23
Semi-Automated Threat Modeling of Cloud-Based Systems Through Extracting Software Architecture from Configuration and Network Flow

Nicholas Pecka, Lotfi Ben Othmane, Bharat Bhargava et al.

Traditional threat modeling occurs during design, but cloud deployments introduce unanticipated threats, especially multi-stage attacks chaining vulnerabilities across trust boundaries. Existing security tools analyze components in isolation, cannot detect architectural threats from system composition, and cannot validate runtime behavior against configured policies. This gap leaves organizations vulnerable to attacks exploiting architectural weaknesses. This paper addresses this gap through a key innovation: automatically inferring system architecture from runtime observations to enable continuous threat modeling. Our methodology combines static configuration analysis with observed network flows to construct architecture graphs reflecting actual operational behavior, then applies systematic threat detection using platform-agnostic abstractions (components, domains, interfaces, access policies, flows). This enables consistent threat identification across bare metal, Kubernetes, and cloud infrastructure without manual diagram maintenance. We validate the methodology using a supply-chain system with ML components deployed on all three platforms, injecting 17 infrastructure and ML threats. Results show detection of all 17 threat types across all platforms, while existing security tools detected only 6-47% with zero ML threat coverage, confirming the necessity of runtime-aware, architecture-level threat analysis.
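The idea of checking observed flows against configured policies can be illustrated with a minimal Python sketch. It is not the paper's tool: the `Flow` type, the component names, the domain labels, and both helper functions are hypothetical, chosen only to show how the abstractions named above (components, domains, access policies, flows) might fit together.

```python
from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class Flow:
    """One directed network flow between two components (hypothetical type)."""
    src: str
    dst: str
    port: int

def unexpected_flows(allowed, observed):
    """Flows seen at runtime that no configured access policy permits."""
    return sorted(set(observed) - set(allowed))

def cross_domain_flows(domain_of, flows):
    """Flows whose endpoints sit in different domains (trust boundaries)."""
    return sorted(f for f in set(flows) if domain_of[f.src] != domain_of[f.dst])

# Hypothetical three-component system, assumed for illustration only.
domain_of = {"web": "dmz", "api": "internal", "db": "internal"}
allowed = {Flow("web", "api", 8080), Flow("api", "db", 5432)}
observed = [Flow("web", "api", 8080), Flow("web", "db", 5432)]

violations = unexpected_flows(allowed, observed)
boundary_crossings = cross_domain_flows(domain_of, observed)
print(violations)
print(boundary_crossings)
```

Here the configuration supplies `allowed` and `domain_of`, while `observed` stands in for captured network traffic; a flow that is both unexpected and boundary-crossing would be a natural first candidate for review.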

CV Jun 14, 2023
SAFER: Situation Aware Facial Emotion Recognition

Mijanur Palash, Bharat Bhargava

In this paper, we present SAFER, a novel system for emotion recognition from facial expressions. It employs state-of-the-art deep learning techniques to extract various features from facial images and incorporates contextual information, such as background and location type, to enhance its performance. The system has been designed to operate in an open-world setting, meaning it can adapt to unseen and varied facial expressions, making it suitable for real-world applications. An extensive evaluation of SAFER against existing works in the field demonstrates improved performance, achieving an accuracy of 91.4% on the CAER-S dataset. Additionally, the study investigates the effect of novelty, such as face masks during the COVID-19 pandemic, on facial emotion recognition and critically examines the limitations of mainstream facial expression datasets. To address these limitations, a novel dataset for facial emotion recognition is proposed. The proposed dataset and the system are expected to be useful for various applications such as human-computer interaction, security, and surveillance.

6.0 LG Mar 30
Pre-Deployment Complexity Estimation for Federated Perception Systems

KMA Solaiman, Shafkat Islam, Ruy de Oliveira et al.

Edge AI systems increasingly rely on federated learning to train perception models in distributed, privacy-preserving, and resource-constrained environments. Yet, before training begins, practitioners often lack practical tools to estimate how difficult a federated learning task will be in terms of achievable accuracy and communication cost. This paper presents a classifier-agnostic, pre-deployment framework for estimating learning complexity in federated perception systems by jointly modeling intrinsic properties of the data and characteristics of the distributed environment. The proposed complexity metric integrates dataset attributes such as dimensionality, sparsity, and heterogeneity with factors related to the composition of participating clients. Using federated learning as a representative distributed training setting, we examine how learning difficulty varies across different federated configurations. Experiments on multiple variants of the MNIST dataset and CIFAR dataset show that the proposed metric strongly correlates with federated learning performance and the communication effort required to reach fixed accuracy targets. These findings suggest that complexity estimation can serve as a practical diagnostic tool for resource planning, dataset assessment, and feasibility evaluation in edge-deployed perception systems.

CV Jun 14, 2023
Continuous Learning Based Novelty Aware Emotion Recognition System

Mijanur Palash, Bharat Bhargava

Current works in human emotion recognition follow the traditional closed learning approach governed by rigid rules without any consideration of novelty. Classification models are trained on collected datasets and are expected to encounter the same data distribution in real-world deployment. Because the world we live in is fluid and constantly changing, unexpected and novel sample distributions can arise and cause the model to fail. Hence, in this work, we propose a continuous-learning-based approach to deal with novelty in the automatic emotion recognition task.

IR Jun 25, 2025
Multimodal Information Retrieval for Open World with Edit Distance Weak Supervision

KMA Solaiman, Bharat Bhargava

Existing multi-media retrieval models either rely on creating a common subspace with modality-specific representation models or require schema mapping among modalities to measure similarities among multi-media data. Our goal is to avoid the annotation overhead incurred from considering retrieval as a supervised classification task and re-use the pretrained encoders in large language models and vision tasks. We propose "FemmIR", a framework to retrieve multimodal results relevant to information needs expressed with multimodal queries by example without any similarity label. Such identification is necessary for real-world applications where data annotations are scarce and satisfactory performance is required without fine-tuning with a common framework across applications. We curate a new dataset called MuQNOL for benchmarking progress on this task. Our technique is based on weak supervision introduced through edit distance between samples: graph edit distance can be modified to consider the cost of replacing a data sample in terms of its properties, and relevance can be measured through the implicit signal from the amount of edit cost among the objects. Unlike metric learning or encoding networks, FemmIR re-uses the high-level properties and maintains the property value and relationship constraints with a multi-level interaction score between data samples and the query example provided by the user. We empirically evaluate FemmIR on a missing person use case with MuQNOL. FemmIR performs comparably to similar retrieval systems in delivering on-demand retrieval results with exact and approximate similarities while using the existing property identifiers in the system.
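One way to read the edit-distance idea is sketched below in Python. This is a deliberately simplified stand-in, not FemmIR itself: it replaces graph edit distance with a flat per-property comparison, and the property names, cost values, and the `edit_cost` and `rank` helpers are all assumptions made for illustration.

```python
def edit_cost(query, candidate, sub_cost=1.0, missing_cost=0.5):
    """Cost of turning `candidate` into `query`, counted property by property.

    A differing value costs `sub_cost`; a property present on only one side
    costs `missing_cost`. Both weights are illustrative assumptions.
    """
    cost = 0.0
    for key in query.keys() | candidate.keys():
        if key not in query or key not in candidate:
            cost += missing_cost
        elif query[key] != candidate[key]:
            cost += sub_cost
    return cost

def rank(query, candidates):
    """Order candidate names by ascending edit cost (lower cost = more relevant)."""
    return sorted(candidates, key=lambda name: edit_cost(query, candidates[name]))

# Hypothetical properties extracted from an image, a text report, and a video.
query = {"gender": "female", "top": "red jacket", "age": "child"}
candidates = {
    "vid_3": {"gender": "male", "top": "blue coat", "age": "adult"},
    "txt_7": {"gender": "female", "top": "blue jacket"},
    "img_1": {"gender": "female", "top": "red jacket", "age": "child"},
}

ranking = rank(query, candidates)
print(ranking)
```

Because the cost is computed from properties that existing identifiers already produce, no similarity labels are needed; the abstract describes using this kind of signal as weak supervision rather than as the final ranking function.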

CR Apr 8, 2021
Detection of Message Injection Attacks onto the CAN Bus using Similarity of Successive Messages-Sequence Graphs

Mubark Jedh, Lotfi ben Othmane, Noor Ahmed et al.

The smart features of modern cars are enabled by a number of Electronic Control Units (ECUs) that communicate through an in-vehicle network known as the Controller Area Network (CAN) bus. The fundamental challenge is the security of the communication link, where an attacker can inject messages (e.g., increase the speed) that may impact the safety of the driver. Developing an effective defensive security solution depends on knowledge of the identity of the ECUs, which is proprietary information. This paper proposes a message injection attack detection mechanism that is independent of the IDs of the ECUs, which is achieved by capturing the patterns in the message sequences. First, we represent the sequencing of the messages in a given time interval as a directed graph and compute the similarities of the successive graphs using cosine similarity and Pearson correlation. Then, we apply a threshold, change point detection, and a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) to detect and predict malicious message injections into the CAN bus. The evaluation of the methods using a dataset collected from a moving vehicle under malicious RPM and speed reading message injections shows a detection accuracy of 98.45% when using LSTM-RNN and 97.32% when using a threshold method. Further, the pace of detecting the change is fast for the case of injection of RPM reading messages but slow for the case of injection of speed reading messages.
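The graph-similarity step with a threshold detector can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the message labels, the window contents, the edge-count graph encoding, and the 0.8 threshold are all hypothetical, and Pearson correlation, change point detection, and the LSTM are omitted.

```python
from collections import Counter
from math import sqrt

def sequence_graph(ids):
    """Directed graph as edge counts: (a, b) -> how often b directly followed a."""
    return Counter(zip(ids, ids[1:]))

def cosine_similarity(g1, g2):
    """Cosine similarity of two edge-count graphs viewed as sparse vectors."""
    dot = sum(g1[e] * g2[e] for e in g1.keys() & g2.keys())
    norm1 = sqrt(sum(v * v for v in g1.values()))
    norm2 = sqrt(sum(v * v for v in g2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)

def flag_windows(windows, threshold=0.8):
    """Indices of windows whose graph differs sharply from the previous window's."""
    graphs = [sequence_graph(w) for w in windows]
    sims = [cosine_similarity(a, b) for a, b in zip(graphs, graphs[1:])]
    return [i + 1 for i, s in enumerate(sims) if s < threshold]

# Hypothetical message labels; "X" plays the role of an injected message.
normal = ["A", "B", "C"] * 4
attacked = ["A", "X", "B", "X", "C", "X"] * 2

flagged = flag_windows([normal, normal, attacked, normal])
print(flagged)
```

In this toy trace the injected message changes every transition, so the similarity drops both when the injection starts and when it stops; both boundary windows are flagged, which is the behavior a change-based detector would be expected to show.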

AI Apr 1, 2021
AdaPool: A Diurnal-Adaptive Fleet Management Framework using Model-Free Deep Reinforcement Learning and Change Point Detection

Marina Haliem, Vaneet Aggarwal, Bharat Bhargava

This paper introduces an adaptive model-free deep reinforcement learning approach that can recognize and adapt to the diurnal patterns in the ride-sharing environment with car-pooling. Deep Reinforcement Learning (RL) suffers from catastrophic forgetting due to being agnostic to the timescale of changes in the distribution of experiences. Although RL algorithms are guaranteed to converge to optimal policies in Markov decision processes (MDPs), this only holds in the presence of static environments. However, this assumption is very restrictive. In many real-world problems like ride-sharing, traffic control, etc., we are dealing with highly dynamic environments, where RL methods yield only sub-optimal decisions. To mitigate this problem in highly dynamic environments, we (1) adopt an online Dirichlet change point detection (ODCP) algorithm to detect the changes in the distribution of experiences, and (2) develop a Deep Q Network (DQN) agent that is capable of recognizing diurnal patterns and making informed dispatching decisions according to the changes in the underlying environment. Rather than fixing patterns by time of week, the proposed approach automatically detects that the MDP has changed, and uses the results of the new model. In addition to the adaptation logic in dispatching, this paper also proposes a dynamic, demand-aware vehicle-passenger matching and route planning framework that dynamically generates optimal routes for each vehicle based on online demand, vehicle capacities, and locations. Evaluation on the New York City Taxi public dataset shows the effectiveness of our approach in improving the fleet utilization, where less than 50% of the fleet is utilized to serve the demand of up to 90% of the requests, while maximizing profits and minimizing idle times.

LG Mar 1, 2021
Decision Making in Monopoly using a Hybrid Deep Reinforcement Learning Approach

Trevor Bonjour, Marina Haliem, Aala Alsalem et al.

Learning to adapt and make real-time informed decisions in a dynamic and complex environment is a challenging problem. Monopoly is a popular strategic board game that requires players to make multiple decisions during the game. Decision-making in Monopoly involves many real-world elements such as strategizing, luck, and modeling of opponent's policies. In this paper, we present novel representations for the state and action space for the full version of Monopoly and define an improved reward function. Using these, we show that our deep reinforcement learning agent can learn winning strategies for Monopoly against different fixed-policy agents. In Monopoly, players can take multiple actions even if it is not their turn to roll the dice. Some of these actions occur more frequently than others, resulting in a skewed distribution that adversely affects the performance of the learning agent. To tackle the non-uniform distribution of actions, we propose a hybrid approach that combines deep reinforcement learning (for frequent but complex decisions) with a fixed policy approach (for infrequent but straightforward decisions). Experimental results show that our hybrid agent outperforms a standard deep reinforcement learning agent by 30% in the number of games won against fixed-policy agents.
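The hybrid split described above amounts to routing each decision to one of two policies. The Python sketch below shows only that routing; the `HybridAgent` class, the decision-kind labels, and the two stub policies are hypothetical placeholders, and the abstract does not say which Monopoly decisions fall on which side.

```python
from typing import Callable, Dict, Set

Policy = Callable[[Dict], str]

class HybridAgent:
    """Routes frequent decision kinds to a learned policy, the rest to fixed rules."""

    def __init__(self, learned: Policy, fixed: Policy, frequent_kinds: Set[str]):
        self.learned = learned
        self.fixed = fixed
        self.frequent_kinds = frequent_kinds

    def act(self, kind: str, state: Dict) -> str:
        policy = self.learned if kind in self.frequent_kinds else self.fixed
        return policy(state)

# Stub policies standing in for a trained network and a hand-written rule set.
def learned_policy(state: Dict) -> str:
    return "learned:" + state["options"][0]

def fixed_policy(state: Dict) -> str:
    return "fixed:" + state["options"][-1]

# "frequent_decision" and "rare_decision" are placeholder labels, not Monopoly actions.
agent = HybridAgent(learned_policy, fixed_policy, frequent_kinds={"frequent_decision"})
state = {"options": ["accept", "decline"]}

frequent_choice = agent.act("frequent_decision", state)
rare_choice = agent.act("rare_decision", state)
print(frequent_choice, rare_choice)
```

Keeping rare decisions out of the learner's action space is what addresses the skewed action distribution: the network only has to learn from decision kinds it sees often enough.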

AI Nov 17, 2020
PassGoodPool: Joint Passengers and Goods Fleet Management with Reinforcement Learning aided Pricing, Matching, and Route Planning

Kaushik Manchella, Marina Haliem, Vaneet Aggarwal et al.

The ubiquitous growth of mobility-on-demand services for passenger and goods delivery has brought various challenges and opportunities within the realm of transportation systems. As a result, intelligent transportation systems are being developed to maximize operational profitability, user convenience, and environmental sustainability. The growth of last mile deliveries alongside ridesharing calls for an efficient and cohesive system that transports both passengers and goods. Existing approaches address this with static routing methods that consider neither the demands of requests nor the transfer of goods between vehicles during route planning. In this paper, we present a dynamic and demand-aware fleet management framework for combined goods and passenger transportation that is capable of (1) involving both passengers and drivers in the decision-making process by allowing drivers to negotiate to a mutually suitable price, and passengers to accept/reject, (2) matching of goods to vehicles, and the multi-hop transfer of goods, (3) dynamically generating optimal routes for each vehicle considering demand along their paths, based on the insertion cost which then determines the matching, (4) dispatching idle vehicles to areas of anticipated high passenger and goods demand using Deep Reinforcement Learning (RL), and (5) allowing for distributed inference at each vehicle while collectively optimizing fleet objectives. Our proposed model is deployable independently within each vehicle as this minimizes computational costs associated with the growth of distributed systems and democratizes decision-making to each individual. Simulations on a variety of vehicle types, goods, and passenger utility functions show the effectiveness of our approach as compared to other methods that do not consider combined load transportation or dynamic multi-hop route planning.

MA Oct 5, 2020
A Distributed Model-Free Ride-Sharing Approach for Joint Matching, Pricing, and Dispatching using Deep Reinforcement Learning

Marina Haliem, Ganapathy Mani, Vaneet Aggarwal et al.

Significant development of ride-sharing services presents a plethora of opportunities to transform urban mobility by providing personalized and convenient transportation while ensuring efficiency of large-scale ride pooling. However, a core problem for such services is route planning for each driver to fulfill the dynamically arriving requests while satisfying given constraints. Current models are mostly limited to static routes with only two rides per vehicle (optimally) or three (with heuristics). In this paper, we present a dynamic, demand-aware, and pricing-based vehicle-passenger matching and route planning framework that (1) dynamically generates optimal routes for each vehicle based on online demand, pricing associated with each ride, vehicle capacities, and locations, using a matching algorithm that starts greedily and optimizes over time through an insertion operation; (2) involves drivers in the decision-making process by allowing them to propose a different price based on the expected reward for a particular ride as well as the destination locations for future rides, which is influenced by supply and demand computed by the Deep Q-network; (3) allows customers to accept or reject rides based on their set of preferences with respect to pricing and delay windows, vehicle type, and carpooling preferences; and (4) re-balances idle vehicles, based on demand prediction, by dispatching them to the areas of anticipated high demand using deep Reinforcement Learning (RL). Our framework is validated using the New York City Taxi public dataset; however, we consider different vehicle types and designed customer utility functions to validate the setup and study different settings. Experimental results show the effectiveness of our approach in real-time and large scale settings.
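An insertion operation of the kind mentioned in item (1) can be sketched as a search over positions for a new pickup and drop-off in an existing route. The Python below is a minimal illustration, not the paper's algorithm: the Manhattan distance, the grid coordinates, and the `best_insertion` helper are assumptions, and pricing, capacity, and delay constraints are left out.

```python
def dist(a, b):
    """Manhattan distance between two grid points (an assumed cost model)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def route_length(route):
    """Total travel distance along consecutive stops."""
    return sum(dist(a, b) for a, b in zip(route, route[1:]))

def best_insertion(route, pickup, dropoff):
    """Cheapest way to add a ride to `route`, keeping pickup before drop-off.

    route[0] is the vehicle's current position and is never displaced.
    Returns (added_distance, new_route).
    """
    base = route_length(route)
    best_cost, best_route = None, None
    for i in range(1, len(route) + 1):        # position of the pickup
        for j in range(i, len(route) + 1):    # drop-off goes at or after it
            candidate = route[:i] + [pickup] + route[i:j] + [dropoff] + route[j:]
            cost = route_length(candidate) - base
            if best_cost is None or cost < best_cost:
                best_cost, best_route = cost, candidate
    return best_cost, best_route

# Hypothetical vehicle heading from (0, 0) to (4, 0); a new ride lies on the way.
route = [(0, 0), (4, 0)]
added, new_route = best_insertion(route, pickup=(1, 0), dropoff=(3, 0))
print(added, new_route)
```

The added distance returned here is one plausible notion of insertion cost; comparing it across vehicles would give a greedy way to decide which vehicle a request is matched to.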

CV Jul 1, 2020
ConFoc: Content-Focus Protection Against Trojan Attacks on Neural Networks

Miguel Villarreal-Vasquez, Bharat Bhargava

Deep Neural Networks (DNNs) have been applied successfully in computer vision. However, their wide adoption in image-related applications is threatened by their vulnerability to trojan attacks. These attacks insert some misbehavior at training using samples with a mark or trigger, which is exploited at inference or testing time. In this work, we analyze the composition of the features learned by DNNs at training. We identify that they, including those related to the inserted triggers, contain both content (semantic information) and style (texture information), which are recognized as a whole by DNNs at testing time. We then propose a novel defensive technique against trojan attacks, in which DNNs are taught to disregard the styles of inputs and focus on their content only to mitigate the effect of triggers during the classification. The generic applicability of the approach is demonstrated in the context of a traffic sign and a face recognition application. Each of them is exposed to a different attack with a variety of triggers. Results show that the method reduces the attack success rate significantly, to values below 1% in all the tested attacks, while maintaining or even improving the initial accuracy of the models when processing both benign and adversarial data.