Joydeep Chandra

LG
h-index12
18papers
36citations
Novelty50%
AI Score54

18 Papers

DBMay 22
CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolving Data Marketplaces

Joydeep Chandra

Temporal knowledge-graph data marketplaces face three coupled failures in static designs: stale hybrid index shortcuts reduce recall as edges evolve, stationary Shapley pricing misattributes value after distribution shifts, and uncoordinated agents over-consume a shared differential-privacy budget. We present CHRONOS, a three-layer architecture providing a unified treatment of these challenges with explicit public and private separation. Layer one applies neural-ODE temporal decay to shortcut edges, providing a per-query expected recall-loss bound of Big-O of Pq lambda delta t, with a monotone-envelope guarantee reducing bound looseness to 1.8 to 3.2 times observed loss. Layer two conditions Shapley valuation on detected changepoints and provides finite-sample error guarantees under noise. Layer three uses EXP3-IX to achieve Big-O of the square root of T log T regret while enforcing epsilon and delta differential privacy via moments accounting. CHRONOS releases a privatized affinity matrix per epoch using the Gaussian mechanism; all retrieval and ranking are post-processing, incurring no extra privacy cost. We provide multi-epoch settlement, scalability analysis for 500 sellers, and comparisons against accelerated baselines. Across four benchmarks, CHRONOS shows 0.937 recall at ten, 2.74 queries per second, 161 ms latency, and total epsilon of 4.25 at delta of 10 to the power of negative 6 under zCDP composition. These results indicate a competitive operating point. A limitation is that at this privacy level, released valuations remain noise-dominated; utility derives primarily from public index routing and adaptive scheduling driven by low-sensitivity statistics.

HCFeb 26Code
TherapyProbe: Generating Design Knowledge for Relational Safety in Mental Health Chatbots Through Adversarial Simulation

Joydeep Chandra, Satyam Kumar Navneet, Yong Zhang

As mental health chatbots proliferate to address the global treatment gap, a critical question emerges: How do we design for relational safety the quality of interaction patterns that unfold across conversations rather than the correctness of individual responses? Current safety evaluations assess single-turn crisis responses, missing the therapeutic dynamics that determine whether chatbots help or harm over time. We introduce TherapyProbe, a design probe methodology that generates actionable design knowledge by systematically exploring chatbot conversation trajectories through adversarial multi-agent simulation. Using open-source models, TherapyProbe surfaces relational safety failures interaction patterns like "validation spirals" where chatbots progressively reinforce hopelessness, or "empathy fatigue" where responses become mechanical over turns. Our contribution is translating these failures into a Safety Pattern Library of 23 failure archetypes with corresponding design recommendations. We contribute: (1) a replicable methodology requiring no API costs, (2) a clinically-grounded failure taxonomy, and (3) design implications for developers, clinicians, and policymakers.

DBMay 3
Efficient and Scalable Self-Healing Databases Using Meta-Learning and Dependency-Driven Recovery

Joydeep Chandra, Prabal Manhas

Modern database management systems (DBMS) face significant challenges in maintaining performance and availability under dynamic workloads. This paper proposes a novel self-healing framework that integrates Model-Agnostic Meta-Learning (MAML) for few-shot anomaly detection, Graph Neural Networks (GNNs) for dependency-driven cascading failure prediction, and multi-objective Reinforcement Learning (RL) for autonomous recovery. Unlike existing database tuning systems that focus primarily on offline configuration optimization, our framework enables real-time, end-to-end self-healing by rapidly adapting to unseen workload patterns with minimal labeled data. We introduce dynamic GNN-based dependency modeling that captures workload-dependent relationships between database components, enabling proactive cascade prevention. A scalarized multi-objective RL formulation balances latency, resource utilization, and cost during recovery, while SHAP-based explainability ensures operational transparency. Evaluations on Google Cluster Data and TPC benchmarks demonstrate 90.5\% anomaly detection F1-score with 5-shot adaptation, 90.1\% cascade prediction accuracy, and 85.1\% latency reduction in recovery actions, outperforming strong baselines including Isolation Forest, LSTM autoencoders, static GCN, and standard RL methods.

IRJun 5, 2025Code
TrueGL: A Truthful, Reliable, and Unified Engine for Grounded Learning in Full-Stack Search

Joydeep Chandra, Aleksandr Algazinov, Satyam Kumar Navneet et al.

In the age of open and free information, a concerning trend of reliance on AI is emerging. However, existing AI tools struggle to evaluate the credibility of information and to justify their assessments. Hence, there is a growing need for systems that can help users evaluate the trustworthiness of online information. Although major search engines incorporate AI features, they often lack clear reliability indicators. We present TrueGL, a model that makes trustworthy search results more accessible. The model is a fine-tuned version of IBM's Granite-1B, trained on the custom dataset and integrated into a search engine with a reliability scoring system. We evaluate the system using prompt engineering and assigning each statement a continuous reliability score from 0.1 to 1, then instructing the model to return a textual explanation alongside the score. Each model's predicted scores are measured against real scores using standard evaluation metrics. TrueGL consistently outperforms other small-scale LLMs and rule-based approaches across all experiments on key evaluation metrics, including MAE, RMSE, and R2. The model's high accuracy, broad content coverage, and ease of use make trustworthy information more accessible and help reduce the spread of false or misleading content online. Our code is publicly available at https://github.com/AlgazinovAleksandr/TrueGL, and our model is publicly released at https://huggingface.co/JoydeepC/trueGL.

HCOct 21, 2025Code
Plural Voices, Single Agent: Towards Inclusive AI in Multi-User Domestic Spaces

Joydeep Chandra, Satyam Kumar Navneet

Domestic AI agents faces ethical, autonomy, and inclusion challenges, particularly for overlooked groups like children, elderly, and Neurodivergent users. We present the Plural Voices Model (PVM), a novel single-agent framework that dynamically negotiates multi-user needs through real-time value alignment, leveraging diverse public datasets on mental health, eldercare, education, and moral reasoning. Using human+synthetic curriculum design with fairness-aware scenarios and ethical enhancements, PVM identifies core values, conflicts, and accessibility requirements to inform inclusive principles. Our privacy-focused prototype features adaptive safety scaffolds, tailored interactions (e.g., step-by-step guidance for Neurodivergent users, simple wording for children), and equitable conflict resolution. In preliminary evaluations, PVM outperforms multi-agent baselines in compliance (76% vs. 70%), fairness (90% vs. 85%), safety-violation rate (0% vs. 7%), and latency. Design innovations, including video guidance, autonomy sliders, family hubs, and adaptive safety dashboards, demonstrate new directions for ethical and inclusive domestic AI, for building user-centered agentic systems in plural domestic contexts. Our Codes and Model are been open sourced, available for reproduction: https://github.com/zade90/Agora

LGMay 4
Mixed-Precision Information Bottlenecks for On-Device Trait-State Disentanglement in Bipolar Agitation Detection

Joydeep Chandra

Continuous monitoring of bipolar disorder agitation via voice biomarkers requires disentangling stable speaker traits from volatile affective states on resource-constrained edge devices. We introduce MP-IB, the first framework to treat mixed-precision quantization as an information bottleneck for clinical trait-state separation. The core insight is that numerical precision itself controls capacity: an FP16 trait head (1,024 bits) encodes speaker identity, while an INT4 state head (128 bits) captures agitation, yielding 8x information asymmetry without adversarial training. We augment this with Dynamic Precision Scheduling and Multi-Scale Temporal Fusion. On Bridge2AI-Voice (N=833, 4 sessions/participant, strict speaker-independent CV), MP-IB achieves rho = 0.117 (95\% CI: [0.089, 0.145], p=0.003 vs. chance), outperforming 94M-parameter WavLM-Adapter with in-domain SSL continuation (rho = -0.042), beta VAE disentanglement (rho = 0.089), and hand-crafted prosody (rho = 0.031) by 2.8--15.9 points absolute. Zero-shot transfer to CREMA-D achieves AUC=0.817. Identity leakage is suppressed to near-random (EER=0.42, MIA-AUC=0.52). End-to-end latency is 23.4 ms with a 617 KB footprint, enabling real-time monitoring on sub 20 dollar devices.

HCFeb 25
When AI Writes, Whose Voice Remains? Quantifying Cultural Marker Erasure Across World English Varieties in Large Language Models

Satyam Kumar Navneet, Joydeep Chandra, Yong Zhang

Large Language Models (LLMs) are increasingly used to ``professionalize'' workplace communication, often at the cost of linguistic identity. We introduce "Cultural Ghosting", the systematic erasure of linguistic markers unique to non-native English varieties during text processing. Through analysis of 22,350 LLM outputs generated from 1,490 culturally marked texts (Indian, Singaporean,& Nigerian English) processed by five models under three prompt conditions, we quantify this phenomenon using two novel metrics: Identity Erasure Rate (IER) & Semantic Preservation Score (SPS). Across all prompts, we find an overall IER of 10.26%, with model-level variation from 3.5% to 20.5% (5.9x range). Crucially, we identify a Semantic Preservation Paradox: models maintain high semantic similarity (mean SPS = 0.748) while systematically erasing cultural markers. Pragmatic markers (politeness conventions) are 1.9x more vulnerable than lexical markers (71.5% vs. 37.1% erasure). Our experiments demonstrate that explicit cultural-preservation prompts reduce erasure by 29% without sacrificing semantic quality.

LGFeb 4
Safe Urban Traffic Control via Uncertainty-Aware Conformal Prediction and World-Model Reinforcement Learning

Joydeep Chandra, Satyam Kumar Navneet, Aleksandr Algazinov et al.

Urban traffic management demands systems that simultaneously predict future conditions, detect anomalies, and take safe corrective actions -- all while providing reliability guarantees. We present STREAM-RL, a unified framework that introduces three novel algorithmic contributions: (1) PU-GAT+, an Uncertainty-Guided Adaptive Conformal Forecaster that uses prediction uncertainty to dynamically reweight graph attention via confidence-monotonic attention, achieving distribution-free coverage guarantees; (2) CRFN-BY, a Conformal Residual Flow Network that models uncertainty-normalized residuals via normalizing flows with Benjamini-Yekutieli FDR control under arbitrary dependence; and (3) LyCon-WRL+, an Uncertainty-Guided Safe World-Model RL agent with Lyapunov stability certificates, certified Lipschitz bounds, and uncertainty-propagated imagination rollouts. To our knowledge, this is the first framework to propagate calibrated uncertainty from forecasting through anomaly detection to safe policy learning with end-to-end theoretical guarantees. Experiments on multiple real-world traffic trajectory data demonstrate that STREAM-RL achieves 91.4\% coverage efficiency, controls FDR at 4.1\% under verified dependence, and improves safety rate to 95.2\% compared to 69\% for standard PPO while achieving higher reward, with 23ms end-to-end inference latency.

LGAug 20, 2025
Aura-CAPTCHA: A Reinforcement Learning and GAN-Enhanced Multi-Modal CAPTCHA System

Joydeep Chandra, Prabal Manhas, Ramanjot Kaur et al.

Aura-CAPTCHA was developed as a multi-modal CAPTCHA system to address vulnerabilities in traditional methods that are increasingly bypassed by AI technologies, such as Optical Character Recognition (OCR) and adversarial image processing. The design integrated Generative Adversarial Networks (GANs) for generating dynamic image challenges, Reinforcement Learning (RL) for adaptive difficulty tuning, and Large Language Models (LLMs) for creating text and audio prompts. Visual challenges included 3x3 grid selections with at least three correct images, while audio challenges combined randomized numbers and words into a single task. RL adjusted difficulty based on incorrect attempts, response time, and suspicious user behavior. Evaluations on real-world traffic demonstrated a 92% human success rate and a 10% bot bypass rate, significantly outperforming existing CAPTCHA systems. The system provided a robust and scalable approach for securing online applications while remaining accessible to users, addressing gaps highlighted in previous research.

SEAug 15, 2025
Rethinking Autonomy: Preventing Failures in AI-Driven Software Engineering

Satyam Kumar Navneet, Joydeep Chandra

The integration of Large Language Models (LLMs) into software engineering has revolutionized code generation, enabling unprecedented productivity through promptware and autonomous AI agents. However, this transformation introduces significant risks, including insecure code generation, hallucinated outputs, irreversible actions, and a lack of transparency and accountability. Incidents like the Replit database deletion underscore the urgent need for robust safety and governance mechanisms. This paper comprehensively analyzes the inherent challenges of LLM-assisted code generation, such as vulnerability inheritance, overtrust, misinterpretation, and the absence of standardized validation and rollback protocols. To address these, we propose the SAFE-AI Framework, a holistic approach emphasizing Safety, Auditability, Feedback, and Explainability. The framework integrates guardrails, sandboxing, runtime verification, risk-aware logging, human-in-the-loop systems, and explainable AI techniques to mitigate risks while fostering trust and compliance. We introduce a novel taxonomy of AI behaviors categorizing suggestive, generative, autonomous, and destructive actions to guide risk assessment and oversight. Additionally, we identify open problems, including the lack of standardized benchmarks for code specific hallucinations and autonomy levels, and propose future research directions for hybrid verification, semantic guardrails, and proactive governance tools. Through detailed comparisons of autonomy control, prompt engineering, explainability, and governance frameworks, this paper provides a roadmap for responsible AI integration in software engineering, aligning with emerging regulations like the EU AI Act and Canada's AIDA to ensure safe, transparent, and accountable AI-driven development.

AIJul 21, 2025
Advancing Responsible Innovation in Agentic AI: A study of Ethical Frameworks for Household Automation

Joydeep Chandra, Satyam Kumar Navneet

The implementation of Artificial Intelligence (AI) in household environments, especially in the form of proactive autonomous agents, brings about possibilities of comfort and attention as well as it comes with intra or extramural ethical challenges. This article analyzes agentic AI and its applications, focusing on its move from reactive to proactive autonomy, privacy, fairness and user control. We review responsible innovation frameworks, human-centered design principles, and governance practices to distill practical guidance for ethical smart home systems. Vulnerable user groups such as elderly individuals, children, and neurodivergent who face higher risks of surveillance, bias, and privacy risks were studied in detail in context of Agentic AI. Design imperatives are highlighted such as tailored explainability, granular consent mechanisms, and robust override controls, supported by participatory and inclusive methodologies. It was also explored how data-driven insights, including social media analysis via Natural Language Processing(NLP), can inform specific user needs and ethical concerns. This survey aims to provide both a conceptual foundation and suggestions for developing transparent, inclusive, and trustworthy agentic AI in household automation.

DBOct 11, 2025
The Hybrid Multimodal Graph Index (HMGI): A Comprehensive Framework for Integrated Relational and Vector Search

Joydeep Chandra, Satyam Kumar Navneet, Yong Zhang

The proliferation of complex, multimodal datasets has exposed a critical gap between the capabilities of specialized vector databases and traditional graph databases. While vector databases excel at semantic similarity search, they lack the capacity for deep relational querying. Conversely, graph databases master complex traversals but are not natively optimized for high-dimensional vector search. This paper introduces the Hybrid Multimodal Graph Index (HMGI), a novel framework designed to bridge this gap by creating a unified system for efficient, hybrid queries on multimodal data. HMGI leverages the native graph database architecture and integrated vector search capabilities, exemplified by platforms like Neo4j, to combine Approximate Nearest Neighbor Search (ANNS) with expressive graph traversal queries. Key innovations of the HMGI framework include modality-aware partitioning of embeddings to optimize index structure and query performance, and a system for adaptive, low-overhead index updates to support dynamic data ingestion, drawing inspiration from the architectural principles of systems like TigerVector. By integrating semantic similarity search directly with relational context, HMGI aims to outperform pure vector databases like Milvus in complex, relationship-heavy query scenarios and achieve sub-linear query times for hybrid tasks.

LGAug 5, 2025
NAEx: A Plug-and-Play Framework for Explaining Network Alignment

Shruti Saxena, Arijit Khan, Joydeep Chandra

Network alignment (NA) identifies corresponding nodes across multiple networks, with applications in domains like social networks, co-authorship, and biology. Despite advances in alignment models, their interpretability remains limited, making it difficult to understand alignment decisions and posing challenges in building trust, particularly in high-stakes domains. To address this, we introduce NAEx, a plug-and-play, model-agnostic framework that explains alignment models by identifying key subgraphs and features influencing predictions. NAEx addresses the key challenge of preserving the joint cross-network dependencies on alignment decisions by: (1) jointly parameterizing graph structures and feature spaces through learnable edge and feature masks, and (2) introducing an optimization objective that ensures explanations are both faithful to the original predictions and enable meaningful comparisons of structural and feature-based similarities between networks. NAEx is an inductive framework that efficiently generates NA explanations for previously unseen data. We introduce evaluation metrics tailored to alignment explainability and demonstrate NAEx's effectiveness and efficiency on benchmark datasets by integrating it with four representative NA models.

CRJul 26, 2025
Policy-Driven AI in Dataspaces: Taxonomy, Explainability, and Pathways for Compliant Innovation

Joydeep Chandra, Satyam Kumar Navneet

As AI-driven dataspaces become integral to data sharing and collaborative analytics, ensuring privacy, performance, and policy compliance presents significant challenges. This paper provides a comprehensive review of privacy-preserving and policy-aware AI techniques, including Federated Learning, Differential Privacy, Trusted Execution Environments, Homomorphic Encryption, and Secure Multi-Party Computation, alongside strategies for aligning AI with regulatory frameworks such as GDPR and the EU AI Act. We propose a novel taxonomy to classify these techniques based on privacy levels, performance impacts, and compliance complexity, offering a clear framework for practitioners and researchers to navigate trade-offs. Key performance metrics -- latency, throughput, cost overhead, model utility, fairness, and explainability -- are analyzed to highlight the multi-dimensional optimization required in dataspaces. The paper identifies critical research gaps, including the lack of standardized privacy-performance KPIs, challenges in explainable AI for federated ecosystems, and semantic policy enforcement amidst regulatory fragmentation. Future directions are outlined, proposing a conceptual framework for policy-driven alignment, automated compliance validation, standardized benchmarking, and integration with European initiatives like GAIA-X, IDS, and Eclipse EDC. By synthesizing technical, ethical, and regulatory perspectives, this work lays the groundwork for developing trustworthy, efficient, and compliant AI systems in dataspaces, fostering innovation in secure and responsible data-driven ecosystems.

DCJul 15, 2025
FLsim: A Modular and Library-Agnostic Simulation Framework for Federated Learning

Arnab Mukherjee, Raju Halder, Joydeep Chandra

Federated Learning (FL) has undergone significant development since its inception in 2016, advancing from basic algorithms to complex methodologies tailored to address diverse challenges and use cases. However, research and benchmarking of novel FL techniques against a plethora of established state-of-the-art solutions remain challenging. To streamline this process, we introduce FLsim, a comprehensive FL simulation framework designed to meet the diverse requirements of FL workflows in the literature. FLsim is characterized by its modularity, scalability, resource efficiency, and controlled reproducibility of experimental outcomes. Its easy to use interface allows users to specify customized FL requirements through job configuration, which supports: (a) customized data distributions, ranging from non-independent and identically distributed (non-iid) data to independent and identically distributed (iid) data, (b) selection of local learning algorithms according to user preferences, with complete agnosticism to ML libraries, (c) choice of network topology illustrating communication patterns among nodes, (d) definition of model aggregation and consensus algorithms, and (e) pluggable blockchain support for enhanced robustness. Through a series of experimental evaluations, we demonstrate the effectiveness and versatility of FLsim in simulating a diverse range of state-of-the-art FL experiments. We envisage that FLsim would mark a significant advancement in FL simulation frameworks, offering unprecedented flexibility and functionality for researchers and practitioners alike.

NIMay 30, 2025
INSIGHT: A Survey of In-Network Systems for Intelligent, High-Efficiency AI and Topology Optimization

Aleksandr Algazinov, Joydeep Chandra, Matt Laing

In-network computation represents a transformative approach to addressing the escalating demands of Artificial Intelligence (AI) workloads on network infrastructure. By leveraging the processing capabilities of network devices such as switches, routers, and Network Interface Cards (NICs), this paradigm enables AI computations to be performed directly within the network fabric, significantly reducing latency, enhancing throughput, and optimizing resource utilization. This paper provides a comprehensive analysis of optimizing in-network computation for AI, exploring the evolution of programmable network architectures, such as Software-Defined Networking (SDN) and Programmable Data Planes (PDPs), and their convergence with AI. It examines methodologies for mapping AI models onto resource-constrained network devices, addressing challenges like limited memory and computational capabilities through efficient algorithm design and model compression techniques. The paper also highlights advancements in distributed learning, particularly in-network aggregation, and the potential of federated learning to enhance privacy and scalability. Frameworks like Planter and Quark are discussed for simplifying development, alongside key applications such as intelligent network monitoring, intrusion detection, traffic management, and Edge AI. Future research directions, including runtime programmability, standardized benchmarks, and new applications paradigms, are proposed to advance this rapidly evolving field. This survey underscores the potential of in-network AI to create intelligent, efficient, and responsive networks capable of meeting the demands of next-generation AI applications.

SIJan 17, 2022
SigGAN : Adversarial Model for Learning Signed Relationships in Networks

Roshni Chakraborty, Ritwika Das, Joydeep Chandra

Signed link prediction in graphs is an important problem that has applications in diverse domains. It is a binary classification problem that predicts whether an edge between a pair of nodes is positive or negative. Existing approaches for link prediction in unsigned networks cannot be directly applied for signed link prediction due to their inherent differences. Further, additional structural constraints, like, the structural balance property of the signed networks must be considered for signed link prediction. Recent signed link prediction approaches generate node representations using either generative models or discriminative models. Inspired by the recent success of Generative Adversarial Network (GAN) based models which comprises of a discriminator and generator in several applications, we propose a Generative Adversarial Network (GAN) based model for signed networks, SigGAN. It considers the requirements of signed networks, such as, integration of information from negative edges, high imbalance in number of positive and negative edges and structural balance theory. Comparing the performance with state of the art techniques on several real-world datasets validates the effectiveness of SigGAN.

SIJan 27, 2019
Predicting Tomorrow's Headline using Today's Twitter Deliberations

Roshni Chakraborty, Abhijeet Kharat, Apalak Khatua et al.

Predicting the popularity of news article is a challenging task. Existing literature mostly focused on article contents and polarity to predict popularity. However, existing research has not considered the users' preference towards a particular article. Understanding users' preference is an important aspect for predicting the popularity of news articles. Hence, we consider the social media data, from the Twitter platform, to address this research gap. In our proposed model, we have considered the users' involvement as well as the users' reaction towards an article to predict the popularity of the article. In short, we are predicting tomorrow's headline by probing today's Twitter discussion. We have considered 300 political news article from the New York Post, and our proposed approach has outperformed other baseline models.