Rajeev Sahay

LG
h-index22
9papers
65citations
Novelty48%
AI Score42

9 Papers

49.5SIApr 20
Spatiotemporal Link Formation Prediction in Social Learning Networks Using Graph Neural Networks

Ali Mohammadiasl, Bita Akram, Seyyedali Hosseinalipour et al.

Social learning networks (SLNs) are graphical representations that capture student interactions within educational settings (e.g., a classroom), with nodes representing students and edges denoting interactions. Accurately predicting future interactions in these networks (i.e., link prediction) is crucial for enabling effective collaborative learning, supporting timely instructional interventions, and informing the design of effective group-based learning activities. However, traditional link prediction approaches are typically tuned to general online social networks (OSNs), often overlooking the complex, non-Euclidean, and dynamically evolving structure of SLNs, thus limiting their effectiveness in educational settings. In this work, we propose a graph neural network (GNN) framework that jointly considers the temporal evolution within classrooms and spatial aggregation across classrooms to perform link prediction in SLNs. Specifically, we analyze link prediction performance of GNNs over the SLNs of four distinct classrooms across their (i) temporal evolutions (varying time instances), (ii) spatial aggregations (joint SLN analysis), and (iii) varying spatial aggregations at varying temporal evolutions throughout the course. Our results indicate statistically significant performance improvements in the prediction of future links as the courses progress temporally. Aggregating SLNs from multiple classrooms generally enhances model performance as well, especially in sparser datasets. Moreover, we find that jointly leveraging both the temporal evolution and spatial aggregation of SLNs significantly outperforms conventional baseline approaches that analyze classrooms in isolation. Our findings demonstrate the efficacy of educationally meaningful link predictions, with direct implications for early-course decision-making and scalable learning analytics in and across classroom settings.

CYJan 20, 2025
The Transition from Centralized Machine Learning to Federated Learning for Mental Health in Education: A Survey of Current Methods and Future Directions

Maryam Ebrahimi, Rajeev Sahay, Seyyedali Hosseinalipour et al.

Research has increasingly explored the application of artificial intelligence (AI) and machine learning (ML) within the mental health domain to enhance both patient care and healthcare provider efficiency. Given that mental health challenges frequently emerge during early adolescence -- the critical years of high school and college -- investigating AI/ML-driven mental health solutions within the education domain is of paramount importance. Nevertheless, conventional AI/ML techniques follow a centralized model training architecture, which poses privacy risks due to the need for transferring students' sensitive data from institutions, universities, and clinics to central servers. Federated learning (FL) has emerged as a solution to address these risks by enabling distributed model training while maintaining data privacy. Despite its potential, research on applying FL to analyze students' mental health remains limited. In this paper, we aim to address this limitation by proposing a roadmap for integrating FL into mental health data analysis within educational settings. We begin by providing an overview of mental health issues among students and reviewing existing studies where ML has been applied to address these challenges. Next, we examine broader applications of FL in the mental health domain to emphasize the lack of focus on educational contexts. Finally, we propose promising research directions focused on using FL to address mental health issues in the education sector, which entails discussing the synergies between the proposed directions with broader human-centered domains. By categorizing the proposed research directions into short- and long-term strategies and highlighting the unique challenges at each stage, we aim to encourage the development of privacy-conscious AI/ML-driven mental health solutions.

LGSep 9, 2025
Bringing Multi-Modal Multi-Task Federated Foundation Models to Education Domain: Prospects and Challenges

Kasra Borazjani, Naji Khosravan, Rajeev Sahay et al.

Multi-modal multi-task (M3T) foundation models (FMs) have recently shown transformative potential in artificial intelligence, with emerging applications in education. However, their deployment in real-world educational settings is hindered by privacy regulations, data silos, and limited domain-specific data availability. We introduce M3T Federated Foundation Models (FedFMs) for education: a paradigm that integrates federated learning (FL) with M3T FMs to enable collaborative, privacy-preserving training across decentralized institutions while accommodating diverse modalities and tasks. Subsequently, this position paper aims to unveil M3T FedFMs as a promising yet underexplored approach to the education community, explore its potentials, and reveal its related future research directions. We outline how M3T FedFMs can advance three critical pillars of next-generation intelligent education systems: (i) privacy preservation, by keeping sensitive multi-modal student and institutional data local; (ii) personalization, through modular architectures enabling tailored models for students, instructors, and institutions; and (iii) equity and inclusivity, by facilitating participation from underrepresented and resource-constrained entities. We finally identify various open research challenges, including studying of (i) inter-institution heterogeneous privacy regulations, (ii) the non-uniformity of data modalities' characteristics, (iii) the unlearning approaches for M3T FedFMs, (iv) the continual learning frameworks for M3T FedFMs, and (v) M3T FedFM model interpretability, which must be collectively addressed for practical deployment.

CRApr 3, 2021
Mitigating Gradient-based Adversarial Attacks via Denoising and Compression

Rehana Mahfuz, Rajeev Sahay, Aly El Gamal

Gradient-based adversarial attacks on deep neural networks pose a serious threat, since they can be deployed by adding imperceptible perturbations to the test data of any network, and the risk they introduce cannot be assessed through the network's original training performance. Denoising and dimensionality reduction are two distinct methods that have been independently investigated to combat such attacks. While denoising offers the ability to tailor the defense to the specific nature of the attack, dimensionality reduction offers the advantage of potentially removing previously unseen perturbations, along with reducing the training time of the network being defended. We propose strategies to combine the advantages of these two defense mechanisms. First, we propose the cascaded defense, which involves denoising followed by dimensionality reduction. To reduce the training time of the defense for a small trade-off in performance, we propose the hidden layer defense, which involves feeding the output of the encoder of a denoising autoencoder into the network. Further, we discuss how adaptive attacks against these defenses could become significantly weak when an alternative defense is used, or when no defense is used. In this light, we propose a new metric to evaluate a defense which measures the sensitivity of the adaptive attack to modifications in the defense. Finally, we present a guideline for building an ordered repertoire of defenses, a.k.a. a defense infrastructure, that adjusts to limited computational resources in presence of uncertainty about the attack strategy.

SPNov 2, 2020
Frequency-based Automated Modulation Classification in the Presence of Adversaries

Rajeev Sahay, Christopher G. Brinton, David J. Love

Automatic modulation classification (AMC) aims to improve the efficiency of crowded radio spectrums by automatically predicting the modulation constellation of wireless RF signals. Recent work has demonstrated the ability of deep learning to achieve robust AMC performance using raw in-phase and quadrature (IQ) time samples. Yet, deep learning models are highly susceptible to adversarial interference, which cause intelligent prediction models to misclassify received samples with high confidence. Furthermore, adversarial interference is often transferable, allowing an adversary to attack multiple deep learning models with a single perturbation crafted for a particular classification network. In this work, we present a novel receiver architecture consisting of deep learning models capable of withstanding transferable adversarial interference. Specifically, we show that adversarial attacks crafted to fool models trained on time-domain features are not easily transferable to models trained using frequency-domain features. In this capacity, we demonstrate classification performance improvements greater than 30% on recurrent neural networks (RNNs) and greater than 50% on convolutional neural networks (CNNs). We further demonstrate our frequency feature-based classification models to achieve accuracies greater than 99% in the absence of attacks.

LGFeb 22, 2020
Non-Intrusive Detection of Adversarial Deep Learning Attacks via Observer Networks

Kirthi Shankar Sivamani, Rajeev Sahay, Aly El Gamal

Recent studies have shown that deep learning models are vulnerable to specifically crafted adversarial inputs that are quasi-imperceptible to humans. In this letter, we propose a novel method to detect adversarial inputs, by augmenting the main classification network with multiple binary detectors (observer networks) which take inputs from the hidden layers of the original network (convolutional kernel outputs) and classify the input as clean or adversarial. During inference, the detectors are treated as a part of an ensemble network and the input is deemed adversarial if at least half of the detectors classify it as so. The proposed method addresses the trade-off between accuracy of classification on clean and adversarial samples, as the original classification network is not modified during the detection process. The use of multiple observer networks makes attacking the detection mechanism non-trivial even when the attacker is aware of the victim classifier. We achieve a 99.5% detection accuracy on the MNIST dataset and 97.5% on the CIFAR-10 dataset using the Fast Gradient Sign Attack in a semi-white box setup. The number of false positive detections is a mere 0.12% in the worst case scenario.

LGJan 26, 2020
Ensemble Noise Simulation to Handle Uncertainty about Gradient-based Adversarial Attacks

Rehana Mahfuz, Rajeev Sahay, Aly El Gamal

Gradient-based adversarial attacks on neural networks can be crafted in a variety of ways by varying either how the attack algorithm relies on the gradient, the network architecture used for crafting the attack, or both. Most recent work has focused on defending classifiers in a case where there is no uncertainty about the attacker's behavior (i.e., the attacker is expected to generate a specific attack using a specific network architecture). However, if the attacker is not guaranteed to behave in a certain way, the literature lacks methods in devising a strategic defense. We fill this gap by simulating the attacker's noisy perturbation using a variety of attack algorithms based on gradients of various classifiers. We perform our analysis using a pre-processing Denoising Autoencoder (DAE) defense that is trained with the simulated noise. We demonstrate significant improvements in post-attack accuracy, using our proposed ensemble-trained defense, compared to a situation where no effort is made to handle uncertainty.

LGJun 13, 2019
A Computationally Efficient Method for Defending Adversarial Deep Learning Attacks

Rajeev Sahay, Rehana Mahfuz, Aly El Gamal

The reliance on deep learning algorithms has grown significantly in recent years. Yet, these models are highly vulnerable to adversarial attacks, which introduce visually imperceptible perturbations into testing data to induce misclassifications. The literature has proposed several methods to combat such adversarial attacks, but each method either fails at high perturbation values, requires excessive computing power, or both. This letter proposes a computationally efficient method for defending the Fast Gradient Sign (FGS) adversarial attack by simultaneously denoising and compressing data. Specifically, our proposed defense relies on training a fully connected multi-layer Denoising Autoencoder (DAE) and using its encoder as a defense against the adversarial attack. Our results show that using this dimensionality reduction scheme is not only highly effective in mitigating the effect of the FGS attack in multiple threat models, but it also provides a 2.43x speedup in comparison to defense strategies providing similar robustness against the same attack.

LGDec 7, 2018
Combatting Adversarial Attacks through Denoising and Dimensionality Reduction: A Cascaded Autoencoder Approach

Rajeev Sahay, Rehana Mahfuz, Aly El Gamal

Machine Learning models are vulnerable to adversarial attacks that rely on perturbing the input data. This work proposes a novel strategy using Autoencoder Deep Neural Networks to defend a machine learning model against two gradient-based attacks: The Fast Gradient Sign attack and Fast Gradient attack. First we use an autoencoder to denoise the test data, which is trained with both clean and corrupted data. Then, we reduce the dimension of the denoised data using the hidden layer representation of another autoencoder. We perform this experiment for multiple values of the bound of adversarial perturbations, and consider different numbers of reduced dimensions. When the test data is preprocessed using this cascaded pipeline, the tested deep neural network classifier yields a much higher accuracy, thus mitigating the effect of the adversarial perturbation.