CVOct 18, 2022Code
ATCON: Attention Consistency for Vision ModelsAli Mirzazadeh, Florian Dubost, Maxwell Pike et al. · stanford
Attention--or attribution--maps methods are methods designed to highlight regions of the model's input that were discriminative for its predictions. However, different attention maps methods can highlight different regions of the input, with sometimes contradictory explanations for a prediction. This effect is exacerbated when the training set is small. This indicates that either the model learned incorrect representations or that the attention maps methods did not accurately estimate the model's representations. We propose an unsupervised fine-tuning method that optimizes the consistency of attention maps and show that it improves both classification performance and the quality of attention maps. We propose an implementation for two state-of-the-art attention computation methods, Grad-CAM and Guided Backpropagation, which relies on an input masking technique. We also show results on Grad-CAM and Integrated Gradients in an ablation study. We evaluate this method on our own dataset of event detection in continuous video recordings of hospital patients aggregated and curated for this work. As a sanity check, we also evaluate the proposed method on PASCAL VOC and SVHN. With the proposed method, with small training sets, we achieve a 6.6 points lift of F1 score over the baselines on our video dataset, a 2.9 point lift of F1 score on PASCAL, and a 1.8 points lift of mean Intersection over Union over Grad-CAM for weakly supervised detection on PASCAL. Those improved attention maps may help clinicians better understand vision model predictions and ease the deployment of machine learning systems into clinical care. We share part of the code for this article at the following repository: https://github.com/alimirzazadeh/SemisupervisedAttention.
AIMay 11
CLEF: EEG Foundation Model for Learning Clinical SemanticsPeng Cao, Ali Mirzazadeh, Jong Woo Lee et al.
Clinical EEG interpretation requires reasoning over full EEG sessions and integrating signal patterns with clinical context. Existing EEG foundation models are largely designed for short-window decoding and do not incorporate clinical context. We introduce CLEF, a clinically grounded long-context EEG foundation model. CLEF represents EEG sessions as 3D multitaper spectrogram tokens, enabling tractable Transformer modeling at session scale, and aligns embeddings with neurologist reports and structured EHR data through contrastive objectives. We evaluate CLEF on a new 234-task benchmark spanning disease phenotypes, medication exposures, and EEG findings, with more than 260k EEG sessions from over 108k patients. CLEF outperforms prior EEG foundation models on 229 of 234 tasks, improving mean AUROC from 0.65 to 0.74. Reconstruction-only pretraining surpasses prior EEG foundation models, while report and EHR alignment yields further gains. Held-out concept and external-cohort experiments suggest that these representations transfer beyond observed alignment targets. These results support session-scale, clinically grounded representation learning as a promising foundation-model paradigm for clinical EEG.
LGOct 11, 2025
Transformer Model Detects Antidepressant Use From a Single Night of Sleep, Unlocking an Adherence BiomarkerAli Mirzazadeh, Simon Cadavid, Kaiwen Zha et al.
Antidepressant nonadherence is pervasive, driving relapse, hospitalization, suicide risk, and billions in avoidable costs. Clinicians need tools that detect adherence lapses promptly, yet current methods are either invasive (serum assays, neuroimaging) or proxy-based and inaccurate (pill counts, pharmacy refills). We present the first noninvasive biomarker that detects antidepressant intake from a single night of sleep. A transformer-based model analyzes sleep data from a consumer wearable or contactless wireless sensor to infer antidepressant intake, enabling remote, effortless, daily adherence assessment at home. Across six datasets comprising 62,000 nights from >20,000 participants (1,800 antidepressant users), the biomarker achieved AUROC = 0.84, generalized across drug classes, scaled with dose, and remained robust to concomitant psychotropics. Longitudinal monitoring captured real-world initiation, tapering, and lapses. This approach offers objective, scalable adherence surveillance with potential to improve depression care and outcomes.
HCAug 20, 2021
MARS: Nano-Power Battery-free Wireless Interfaces for Touch, Swipe and Speech InputNivedita Arora, Ali Mirzazadeh, Injoo Moon et al.
Augmenting everyday surfaces with interaction sensing capability that is maintenance-free, low-cost (about $1), and in an appropriate form factor is a challenge with current technologies. MARS (Multi-channel Ambiently-powered Realtime Sensing) enables battery-free sensing and wireless communication of touch, swipe, and speech interactions by combining a nanowatt programmable oscillator with frequency-shifted analog backscatter communication. A zero-threshold voltage field-effect transistor (FET) is used to create an oscillator with a low startup voltage (about 500 mV) and current (< 2uA), whose frequency can be affected through changes in inductance or capacitance from the user interactions. Multiple MARS systems can operate in the same environment by tuning each oscillator circuit to a different frequency range. The nanowatt power budget allows the system to be powered directly through ambient energy sources like photodiodes or thermoelectric generators. We differentiate MARS from previous systems based on power requirements, cost, and part count and explore different interaction and activity sensing scenarios suitable for indoor environments.