Mahsa Shoaran

LG
h-index61
13papers
242citations
Novelty41%
AI Score37

13 Papers

AIApr 4, 2022
Challenges and Opportunities of Edge AI for Next-Generation Implantable BMIs

MohammadAli Shaeri, Arshia Afzal, Mahsa Shoaran

Neuroscience and neurotechnology are currently being revolutionized by artificial intelligence (AI) and machine learning. AI is widely used to study and interpret neural signals (analytical applications), assist people with disabilities (prosthetic applications), and treat underlying neurological symptoms (therapeutic applications). In this brief, we will review the emerging opportunities of on-chip AI for the next-generation implantable brain-machine interfaces (BMIs), with a focus on state-of-the-art prosthetic BMIs. Major technological challenges for the effectiveness of AI models will be discussed. Finally, we will present algorithmic and IC design solutions to enable a new generation of AI-enhanced and high-channel-count BMIs.

SPMay 13, 2024
Intelligent and Miniaturized Neural Interfaces: An Emerging Era in Neurotechnology

Mahsa Shoaran, Uisub Shin, MohammadAli Shaeri

Integrating smart algorithms on neural devices presents significant opportunities for various brain disorders. In this paper, we review the latest advancements in the development of three categories of intelligent neural prostheses featuring embedded signal processing on the implantable or wearable device. These include: 1) Neural interfaces for closed-loop symptom tracking and responsive stimulation; 2) Neural interfaces for emerging network-related conditions, such as psychiatric disorders; and 3) Intelligent BMI SoCs for movement recovery following paralysis.

LGFeb 22, 2025
Linear Attention for Efficient Bidirectional Sequence Modeling

Arshia Afzal, Elias Abad Rocamora, Leyla Naz Candogan et al.

Linear Transformers and State Space Models have emerged as efficient alternatives to softmax Transformers for causal sequence modeling, enabling parallel training via matrix multiplication and efficient RNN-style inference. However, despite their success in causal tasks, no unified framework exists for applying Linear Transformers to bidirectional sequence modeling. We introduce LION, the first framework to systematically extend Linear Transformers to the bidirectional setting. LION generalizes three core representations commonly used in the causal case - full Linear Attention , bidirectional RNN, and chunkwise parallel form - to the bidirectional setting. These forms are theoretically equivalent and enable models to exploit the strengths of each during training and inference. We prove that a broad class of Linear Transformers can be extended using LION and validate our framework via three core examples based on the choice of decay type: LION-LIT, the bidirectional extension of arXiv:2006.16236; LION-D, based on arXiv:2307.08621; and LION-S, a variant using selective decay arXiv:2103.02143, arXiv:2312.0075. Across standard bidirectional tasks, LION enables models to match or exceed the performance of softmax Transformers, while offering significantly faster training and more efficient inference than existing State Space Models.

NCAug 19, 2025
BiND: A Neural Discriminator-Decoder for Accurate Bimanual Trajectory Prediction in Brain-Computer Interfaces

Timothee Robert, MohammadAli Shaeri, Mahsa Shoaran

Decoding bimanual hand movements from intracortical recordings remains a critical challenge for brain-computer interfaces (BCIs), due to overlapping neural representations and nonlinear interlimb interactions. We introduce BiND (Bimanual Neural Discriminator-Decoder), a two-stage model that first classifies motion type (unimanual left, unimanual right, or bimanual) and then uses specialized GRU-based decoders, augmented with a trial-relative time index, to predict continuous 2D hand velocities. We benchmark BiND against six state-of-the-art models (SVR, XGBoost, FNN, CNN, Transformer, GRU) on a publicly available 13-session intracortical dataset from a tetraplegic patient. BiND achieves a mean $R^2$ of 0.76 ($\pm$0.01) for unimanual and 0.69 ($\pm$0.03) for bimanual trajectory prediction, surpassing the next-best model (GRU) by 2% in both tasks. It also demonstrates greater robustness to session variability than all other benchmarked models, with accuracy improvements of up to 4% compared to GRU in cross-session analyses. This highlights the effectiveness of task-aware discrimination and temporal modeling in enhancing bimanual decoding.

LGAug 13, 2025
rETF-semiSL: Semi-Supervised Learning for Neural Collapse in Temporal Data

Yuhan Xie, William Cappelletti, Mahsa Shoaran et al.

Deep neural networks for time series must capture complex temporal patterns, to effectively represent dynamic data. Self- and semi-supervised learning methods show promising results in pre-training large models, which -- when finetuned for classification -- often outperform their counterparts trained from scratch. Still, the choice of pretext training tasks is often heuristic and their transferability to downstream classification is not granted, thus we propose a novel semi-supervised pre-training strategy to enforce latent representations that satisfy the Neural Collapse phenomenon observed in optimally trained neural classifiers. We use a rotational equiangular tight frame-classifier and pseudo-labeling to pre-train deep encoders with few labeled samples. Furthermore, to effectively capture temporal dynamics while enforcing embedding separability, we integrate generative pretext tasks with our method, and we define a novel sequential augmentation strategy. We show that our method significantly outperforms previous pretext tasks when applied to LSTMs, transformers, and state-space models on three multivariate time series classification datasets. These results highlight the benefit of aligning pre-training objectives with theoretically grounded embedding geometry.

AIMay 5, 2025
Machine-Learning-Powered Neural Interfaces for Smart Prosthetics and Diagnostics

MohammadAli Shaeri, Jinhan Liu, Mahsa Shoaran

Advanced neural interfaces are transforming applications ranging from neuroscience research to diagnostic tools (for mental state recognition, tremor and seizure detection) as well as prosthetic devices (for motor and communication recovery). By integrating complex functions into miniaturized neural devices, these systems unlock significant opportunities for personalized assistive technologies and adaptive therapeutic interventions. Leveraging high-density neural recordings, on-site signal processing, and machine learning (ML), these interfaces extract critical features, identify disease neuro-markers, and enable accurate, low-latency neural decoding. This integration facilitates real-time interpretation of neural signals, adaptive modulation of brain activity, and efficient control of assistive devices. Moreover, the synergy between neural interfaces and ML has paved the way for self-sufficient, ubiquitous platforms capable of operating in diverse environments with minimal hardware costs and external dependencies. In this work, we review recent advancements in AI-driven decoding algorithms and energy-efficient System-on-Chip (SoC) platforms for next-generation miniaturized neural devices. These innovations highlight the potential for developing intelligent neural interfaces, addressing critical challenges in scalability, reliability, interpretability, and user adaptability.

SPMar 11, 2025
MT-NAM: An Efficient and Adaptive Model for Epileptic Seizure Detection

Arshia Afzal, Volkan Cevher, Mahsa Shoaran

Enhancing the accuracy and efficiency of machine learning algorithms employed in neural interface systems is crucial for advancing next-generation intelligent therapeutic devices. However, current systems often utilize basic machine learning models that do not fully exploit the natural structure of brain signals. Additionally, existing learning models used for neural signal processing often demonstrate low speed and efficiency during inference. To address these challenges, this study introduces Micro Tree-based NAM (MT-NAM), a distilled model based on the recently proposed Neural Additive Models (NAM). The MT-NAM achieves a remarkable 100$\times$ improvement in inference speed compared to standard NAM, without compromising accuracy. We evaluate our approach on the CHB-MIT scalp EEG dataset, which includes recordings from 24 patients with varying numbers of sessions and seizures. NAM achieves an 85.3\% window-based sensitivity and 95\% specificity. Interestingly, our proposed MT-NAM shows only a 2\% reduction in sensitivity compared to the original NAM. To regain this sensitivity, we utilize a test-time template adjuster (T3A) as an update mechanism, enabling our model to achieve higher sensitivity during test time by accommodating transient shifts in neural signals. With this online update approach, MT-NAM achieves the same sensitivity as the standard NAM while achieving approximately 50$\times$ acceleration in inference speed.

SPJun 3, 2024
REST: Efficient and Accelerated EEG Seizure Analysis through Residual State Updates

Arshia Afzal, Grigorios Chrysos, Volkan Cevher et al.

EEG-based seizure detection models face challenges in terms of inference speed and memory efficiency, limiting their real-time implementation in clinical devices. This paper introduces a novel graph-based residual state update mechanism (REST) for real-time EEG signal analysis in applications such as epileptic seizure detection. By leveraging a combination of graph neural networks and recurrent structures, REST efficiently captures both non-Euclidean geometry and temporal dependencies within EEG data. Our model demonstrates high accuracy in both seizure detection and classification tasks. Notably, REST achieves a remarkable 9-fold acceleration in inference speed compared to state-of-the-art models, while simultaneously demanding substantially less memory than the smallest model employed for this task. These attributes position REST as a promising candidate for real-time implementation in clinical devices, such as Responsive Neurostimulation or seizure alert systems.

LGMay 10, 2023
XTab: Cross-table Pretraining for Tabular Transformers

Bingzhao Zhu, Xingjian Shi, Nick Erickson et al.

The success of self-supervised learning in computer vision and natural language processing has motivated pretraining methods on tabular data. However, most existing tabular self-supervised learning models fail to leverage information across multiple data tables and cannot generalize to new tables. In this work, we introduce XTab, a framework for cross-table pretraining of tabular transformers on datasets from various domains. We address the challenge of inconsistent column types and quantities among tables by utilizing independent featurizers and using federated learning to pretrain the shared component. Tested on 84 tabular prediction tasks from the OpenML-AutoML Benchmark (AMLB), we show that (1) XTab consistently boosts the generalizability, learning speed, and performance of multiple tabular transformers, (2) by pretraining FT-Transformer via XTab, we achieve superior performance than other state-of-the-art tabular deep learning models on various tasks such as regression, binary, and multiclass classification.

LGOct 1, 2021
Tree in Tree: from Decision Trees to Decision Graphs

Bingzhao Zhu, Mahsa Shoaran

Decision trees have been widely used as classifiers in many machine learning applications thanks to their lightweight and interpretable decision process. This paper introduces Tree in Tree decision graph (TnT), a framework that extends the conventional decision tree to a more generic and powerful directed acyclic graph. TnT constructs decision graphs by recursively growing decision trees inside the internal or leaf nodes instead of greedy training. The time complexity of TnT is linear to the number of nodes in the graph, and it can construct decision graphs on large datasets. Compared to decision trees, we show that TnT achieves better classification performance with reduced model size, both as a stand-alone classifier and as a base estimator in bagging/AdaBoost ensembles. Our proposed model is a novel, more efficient, and accurate alternative to the widely-used decision trees.

LGFeb 28, 2021
Unsupervised Domain Adaptation for Cross-Subject Few-Shot Neurological Symptom Detection

Bingzhao Zhu, Mahsa Shoaran

Modern machine learning tools have shown promise in detecting symptoms of neurological disorders. However, current approaches typically train a unique classifier for each subject. This subject-specific training scheme requires long labeled recordings from each patient, thus failing to detect symptoms in new patients with limited recordings. This paper introduces an unsupervised domain adaptation approach based on adversarial networks to enable few-shot, cross-subject epileptic seizure detection. Using adversarial learning, features from multiple patients were encoded into a subject-invariant space and a discriminative model was trained on subject-invariant features to make predictions. We evaluated this approach on the intracranial EEG (iEEG) recordings from 9 patients with epilepsy. Our approach enabled cross-subject seizure detection with a 9.4\% improvement in 1-shot classification accuracy compared to the conventional subject-specific scheme.

AROct 15, 2020
Closed-Loop Neural Interfaces with Embedded Machine Learning

Bingzhao Zhu, Uisub Shin, Mahsa Shoaran

Neural interfaces capable of multi-site electrical recording, on-site signal classification, and closed-loop therapy are critical for the diagnosis and treatment of neurological disorders. However, deploying machine learning algorithms on low-power neural devices is challenging, given the tight constraints on computational and memory resources for such devices. In this paper, we review the recent developments in embedding machine learning in neural interfaces, with a focus on design trade-offs and hardware efficiency. We also present our optimized tree-based model for low-power and memory-efficient classification of neural signal in brain implants. Using energy-aware learning and model compression, we show that the proposed oblique trees can outperform conventional machine learning models in applications such as seizure or tremor detection and motor decoding.

LGJun 14, 2020
ResOT: Resource-Efficient Oblique Trees for Neural Signal Classification

Bingzhao Zhu, Masoud Farivar, Mahsa Shoaran

Classifiers that can be implemented on chip with minimal computational and memory resources are essential for edge computing in emerging applications such as medical and IoT devices. This paper introduces a machine learning model based on oblique decision trees to enable resource-efficient classification on a neural implant. By integrating model compression with probabilistic routing and implementing cost-aware learning, our proposed model could significantly reduce the memory and hardware cost compared to state-of-the-art models, while maintaining the classification accuracy. We trained the resource-efficient oblique tree with power-efficient regularization (ResOT-PE) on three neural classification tasks to evaluate the performance, memory, and hardware requirements. On seizure detection task, we were able to reduce the model size by 3.4X and the feature extraction cost by 14.6X compared to the ensemble of boosted trees, using the intracranial EEG from 10 epilepsy patients. In a second experiment, we tested the ResOT-PE model on tremor detection for Parkinson's disease, using the local field potentials from 12 patients implanted with a deep-brain stimulation (DBS) device. We achieved a comparable classification performance as the state-of-the-art boosted tree ensemble, while reducing the model size and feature extraction cost by 10.6X and 6.8X, respectively. We also tested on a 6-class finger movement detection task using ECoG recordings from 9 subjects, reducing the model size by 17.6X and feature computation cost by 5.1X. The proposed model can enable a low-power and memory-efficient implementation of classifiers for real-time neurological disease detection and motor decoding.