Madalina Fiterau

LG
h-index15
18papers
263citations
Novelty45%
AI Score40

18 Papers

LGSep 1, 2022
Progressive Fusion for Multimodal Integration

Shiv Shankar, Laure Thompson, Madalina Fiterau · princeton

Integration of multimodal information from various sources has been shown to boost the performance of machine learning models and thus has received increased attention in recent years. Often such models use deep modality-specific networks to obtain unimodal features which are combined to obtain "late-fusion" representations. However, these designs run the risk of information loss in the respective unimodal pipelines. On the other hand, "early-fusion" methodologies, which combine features early, suffer from the problems associated with feature heterogeneity and high sample complexity. In this work, we present an iterative representation refinement approach, called Progressive Fusion, which mitigates the issues with late fusion representations. Our model-agnostic technique introduces backward connections that make late stage fused representations available to early layers, improving the expressiveness of the representations at those stages, while retaining the advantages of late fusion designs. We test Progressive Fusion on tasks including affective sentiment detection, multimedia analysis, and time series fusion with different models, demonstrating its versatility. We show that our approach consistently improves performance, for instance attaining a 5% reduction in MSE and 40% improvement in robustness on multimodal time series prediction.

IVOct 7, 2023
Machine Learning for Automated Mitral Regurgitation Detection from Cardiac Imaging

Ke Xiao, Erik Learned-Miller, Evangelos Kalogerakis et al.

Mitral regurgitation (MR) is a heart valve disease with potentially fatal consequences that can only be forestalled through timely diagnosis and treatment. Traditional diagnosis methods are expensive, labor-intensive and require clinical expertise, posing a barrier to screening for MR. To overcome this impediment, we propose a new semi-supervised model for MR classification called CUSSP. CUSSP operates on cardiac imaging slices of the 4-chamber view of the heart. It uses standard computer vision techniques and contrastive models to learn from large amounts of unlabeled data, in conjunction with specialized classifiers to establish the first ever automated MR classification system. Evaluated on a test set of 179 labeled -- 154 non-MR and 25 MR -- sequences, CUSSP attains an F1 score of 0.69 and a ROC-AUC score of 0.88, setting the first benchmark result for this new task.

LGJun 16, 2023
MultiWave: Multiresolution Deep Architectures through Wavelet Decomposition for Multivariate Time Series Prediction

Iman Deznabi, Madalina Fiterau

The analysis of multivariate time series data is challenging due to the various frequencies of signal changes that can occur over both short and long terms. Furthermore, standard deep learning models are often unsuitable for such datasets, as signals are typically sampled at different rates. To address these issues, we introduce MultiWave, a novel framework that enhances deep learning time series models by incorporating components that operate at the intrinsic frequencies of signals. MultiWave uses wavelets to decompose each signal into subsignals of varying frequencies and groups them into frequency bands. Each frequency band is handled by a different component of our model. A gating mechanism combines the output of the components to produce sparse models that use only specific signals at specific frequencies. Our experiments demonstrate that MultiWave accurately identifies informative frequency bands and improves the performance of various deep learning models, including LSTM, Transformer, and CNN-based models, for a wide range of applications. It attains top performance in stress and affect detection from wearables. It also increases the AUC of the best-performing model by 5% for in-hospital COVID-19 mortality prediction from patient blood samples and for human activity recognition from accelerometer and gyroscope data. We show that MultiWave consistently identifies critical features and their frequency components, thus providing valuable insights into the applications studied.

LGApr 16, 2024
A/B testing under Interference with Partial Network Information

Shiv Shankar, Ritwik Sinha, Yash Chandak et al.

A/B tests are often required to be conducted on subjects that might have social connections. For e.g., experiments on social media, or medical and social interventions to control the spread of an epidemic. In such settings, the SUTVA assumption for randomized-controlled trials is violated due to network interference, or spill-over effects, as treatments to group A can potentially also affect the control group B. When the underlying social network is known exactly, prior works have demonstrated how to conduct A/B tests adequately to estimate the global average treatment effect (GATE). However, in practice, it is often impossible to obtain knowledge about the exact underlying network. In this paper, we present UNITE: a novel estimator that relax this assumption and can identify GATE while only relying on knowledge of the superset of neighbors for any subject in the graph. Through theoretical analysis and extensive experiments, we show that the proposed approach performs better in comparison to standard estimators.

LGDec 16, 2024
Leveraging Foundation Language Models (FLMs) for Automated Cohort Extraction from Large EHR Databases

Purity Mugambi, Alexandra Meliou, Madalina Fiterau

A crucial step in cohort studies is to extract the required cohort from one or more study datasets. This step is time-consuming, especially when a researcher is presented with a dataset that they have not previously worked with. When the cohort has to be extracted from multiple datasets, cohort extraction can be extremely laborious. In this study, we present an approach for partially automating cohort extraction from multiple electronic health record (EHR) databases. We formulate the guided multi-dataset cohort extraction problem in which selection criteria are first converted into queries, translating them from natural language text to language that maps to database entities. Then, using FLMs, columns of interest identified from the queries are automatically matched between the study databases. Finally, the generated queries are run across all databases to extract the study cohort. We propose and evaluate an algorithm for automating column matching on two large, popular and publicly-accessible EHR databases -- MIMIC-III and eICU. Our approach achieves a high top-three accuracy of $92\%$, correctly matching $12$ out of the $13$ columns of interest, when using a small, pre-trained general purpose language model. Furthermore, this accuracy is maintained even as the search space (i.e., size of the database) increases.

LGOct 19, 2025
Resolution-Aware Retrieval Augmented Zero-Shot Forecasting

Iman Deznabi, Peeyush Kumar, Madalina Fiterau

Zero-shot forecasting aims to predict outcomes for previously unseen conditions without direct historical data, posing a significant challenge for traditional forecasting methods. We introduce a Resolution-Aware Retrieval-Augmented Forecasting model that enhances predictive accuracy by leveraging spatial correlations and temporal frequency characteristics. By decomposing signals into different frequency components, our model employs resolution-aware retrieval, where lower-frequency components rely on broader spatial context, while higher-frequency components focus on local influences. This allows the model to dynamically retrieve relevant data and adapt to new locations with minimal historical context. Applied to microclimate forecasting, our model significantly outperforms traditional forecasting methods, numerical weather prediction models, and modern foundation time series models, achieving 71% lower MSE than HRRR and 34% lower MSE than Chronos on the ERA5 dataset. Our results highlight the effectiveness of retrieval-augmented and resolution-aware strategies, offering a scalable and data-efficient solution for zero-shot forecasting in microclimate modeling and beyond.

LGSep 2, 2025
Challenges in Understanding Modality Conflict in Vision-Language Models

Trang Nguyen, Jackson Michaels, Madalina Fiterau et al.

This paper highlights the challenge of decomposing conflict detection from conflict resolution in Vision-Language Models (VLMs) and presents potential approaches, including using a supervised metric via linear probes and group-based attention pattern analysis. We conduct a mechanistic investigation of LLaVA-OV-7B, a state-of-the-art VLM that exhibits diverse resolution behaviors when faced with conflicting multimodal inputs. Our results show that a linearly decodable conflict signal emerges in the model's intermediate layers and that attention patterns associated with conflict detection and resolution diverge at different stages of the network. These findings support the hypothesis that detection and resolution are functionally distinct mechanisms. We discuss how such decomposition enables more actionable interpretability and targeted interventions for improving model robustness in challenging multimodal settings.

LGJan 5, 2024
Zero-shot Microclimate Prediction with Deep Learning

Iman Deznabi, Peeyush Kumar, Madalina Fiterau

Weather station data is a valuable resource for climate prediction, however, its reliability can be limited in remote locations. To compound the issue, making local predictions often relies on sensor data that may not be accessible for a new, previously unmonitored location. In response to these challenges, we propose a novel zero-shot learning approach designed to forecast various climate measurements at new and unmonitored locations. Our method surpasses conventional weather forecasting techniques in predicting microclimate variables by leveraging knowledge extracted from other geographic locations.

LGJul 18, 2020
Structure Mapping for Transferability of Causal Models

Purva Pruthi, Javier González, Xiaoyu Lu et al.

Human beings learn causal models and constantly use them to transfer knowledge between similar environments. We use this intuition to design a transfer-learning framework using object-oriented representations to learn the causal relationships between objects. A learned causal dynamics model can be used to transfer between variants of an environment with exchangeable perceptual features among objects but with the same underlying causal dynamics. We adapt continuous optimization for structure learning techniques to explicitly learn the cause and effects of the actions in an interactive environment and transfer to the target domain by categorization of the objects based on causal knowledge. We demonstrate the advantages of our approach in a gridworld setting by combining causal model-based approach with model-free approach in reinforcement learning.

MLJun 23, 2020
Normalizing Flows Across Dimensions

Edmond Cunningham, Renos Zabounidis, Abhinav Agrawal et al.

Real-world data with underlying structure, such as pictures of faces, are hypothesized to lie on a low-dimensional manifold. This manifold hypothesis has motivated state-of-the-art generative algorithms that learn low-dimensional data representations. Unfortunately, a popular generative model, normalizing flows, cannot take advantage of this. Normalizing flows are based on successive variable transformations that are, by design, incapable of learning lower-dimensional representations. In this paper we introduce noisy injective flows (NIF), a generalization of normalizing flows that can go across dimensions. NIF explicitly map the latent space to a learnable manifold in a high-dimensional data space using injective transformations. We further employ an additive noise model to account for deviations from the manifold and identify a stochastic inverse of the generative process. Empirically, we demonstrate that a simple application of our method to existing flow architectures can significantly improve sample quality and yield separable data embeddings.

LGJun 26, 2019
Personalized Student Stress Prediction with Deep Multitask Network

Abhinav Shaw, Natcha Simsiri, Iman Deznaby et al.

With the growing popularity of wearable devices, the ability to utilize physiological data collected from these devices to predict the wearer's mental state such as mood and stress suggests great clinical applications, yet such a task is extremely challenging. In this paper, we present a general platform for personalized predictive modeling of behavioural states like students' level of stress. Through the use of Auto-encoders and Multitask learning we extend the prediction of stress to both sequences of passive sensor data and high-level covariates. Our model outperforms the state-of-the-art in the prediction of stress level from mobile sensor data, obtaining a 45.6 % improvement in F1 score on the StudentLife dataset.

IVJun 10, 2019
Alzheimer's Disease Brain MRI Classification: Challenges and Insights

Yi Ren Fung, Ziqiang Guan, Ritesh Kumar et al.

In recent years, many papers have reported state-of-the-art performance on Alzheimer's Disease classification with MRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset using convolutional neural networks. However, we discover that when we split that data into training and testing sets at the subject level, we are not able to obtain similar performance, bringing the validity of many of the previous studies into question. Furthermore, we point out that previous works use different subsets of the ADNI data, making comparison across similar works tricky. In this study, we present the results of three splitting methods, discuss the motivations behind their validity, and report our results using all of the available subjects.

LGApr 30, 2019
Multi-resolution Networks For Flexible Irregular Time Series Modeling (Multi-FIT)

Bhanu Pratap Singh, Iman Deznabi, Bharath Narasimhan et al.

Missing values, irregularly collected samples, and multi-resolution signals commonly occur in multivariate time series data, making predictive tasks difficult. These challenges are especially prevalent in the healthcare domain, where patients' vital signs and electronic records are collected at different frequencies and have occasionally missing information due to the imperfections in equipment or patient circumstances. Researchers have handled each of these issues differently, often handling missing data through mean value imputation and then using sequence models over the multivariate signals while ignoring the different resolution of signals. We propose a unified model named Multi-resolution Flexible Irregular Time series Network (Multi-FIT). The building block for Multi-FIT is the FIT network. The FIT network creates an informative dense representation at each time step using signal information such as last observed value, time difference since the last observed time stamp and overall mean for the signal. Vertical FIT (FIT-V) is a variant of FIT which also models the relationship between different temporal signals while creating the informative dense representations for the signal. The multi-FIT model uses multiple FIT networks for sets of signals with different resolutions, further facilitating the construction of flexible representations. Our model has three main contributions: a.) it does not impute values but rather creates informative representations to provide flexibility to the model for creating task-specific representations b.) it models the relationship between different signals in the form of support signals c.) it models different resolutions in parallel before merging them for the final prediction task. The FIT, FIT-V and Multi-FIT networks improve upon the state-of-the-art models for three predictive tasks, including the forecasting of patient survival.

LGApr 17, 2019
FLARe: Forecasting by Learning Anticipated Representations

Surya Teja Devarakonda, Joie Yeahuay Wu, Yi Ren Fung et al.

Computational models that forecast the progression of Alzheimer's disease at the patient level are extremely useful tools for identifying high risk cohorts for early intervention and treatment planning. The state-of-the-art work in this area proposes models that forecast by using latent representations extracted from the longitudinal data across multiple modalities, including volumetric information extracted from medical scans and demographic info. These models incorporate the time horizon, which is the amount of time between the last recorded visit and the future visit, by directly concatenating a representation of it to the data latent representation. In this paper, we present a model which generates a sequence of latent representations of the patient status across the time horizon, providing more informative modeling of the temporal relationships between the patient's history and future visits. Our proposed model outperforms the baseline in terms of forecasting accuracy and F1 score with the added benefit of robustly handling missing visits.

CVApr 16, 2019
A Comprehensive Study of Alzheimer's Disease Classification Using Convolutional Neural Networks

Ziqiang Guan, Ritesh Kumar, Yi Ren Fung et al.

A plethora of deep learning models have been developed for the task of Alzheimer's disease classification from brain MRI scans. Many of these models report high performance, achieving three-class classification accuracy of up to 95%. However, it is common for these studies to draw performance comparisons between models that are trained on different subsets of a dataset or use varying imaging preprocessing techniques, making it difficult to objectively assess model performance. Furthermore, many of these works do not provide details such as hyperparameters, the specific MRI scans used, or their source code, making it difficult to replicate their experiments. To address these concerns, we present a comprehensive study of some of the deep learning methods and architectures on the full set of images available from ADNI. We find that, (1) classification using 3D models gives an improvement of 1% in our setup, at the cost of significantly longer training time and more computation power, (2) with our dataset, pre-training yields minimal ($<0.5\%$) improvement in model performance, (3) most popular convolutional neural network models yield similar performance when compared to each other. Lastly, we briefly compare the effects of two image preprocessing programs: FreeSurfer and Clinica, and find that the spatially normalized and segmented outputs from Clinica increased the accuracy of model prediction from 63% to 89% when compared to FreeSurfer images.

CVApr 15, 2019
Pedestrian Detection in Thermal Images using Saliency Maps

Debasmita Ghose, Shasvat Mukeshkumar Desai, Sneha Bhattacharya et al.

Thermal images are mainly used to detect the presence of people at night or in bad lighting conditions, but perform poorly at daytime. To solve this problem, most state-of-the-art techniques employ a fusion network that uses features from paired thermal and color images. Instead, we propose to augment thermal images with their saliency maps, to serve as an attention mechanism for the pedestrian detector especially during daytime. We investigate how such an approach results in improved performance for pedestrian detection using only thermal images, eliminating the need for paired color images. For our experiments, we train the Faster R-CNN for pedestrian detection and report the added effect of saliency maps generated using static and deep methods (PiCA-Net and R3-Net). Our best performing model results in an absolute reduction of miss rate by 13.4% and 19.4% over the baseline in day and night images respectively. We also annotate and release pixel level masks of pedestrians on a subset of the KAIST Multispectral Pedestrian Detection dataset, which is a first publicly available dataset for salient pedestrian detection.

LGNov 17, 2018
Machine Learning for Health (ML4H) Workshop at NeurIPS 2018

Natalia Antropova, Andrew L. Beam, Brett K. Beaulieu-Jones et al.

This volume represents the accepted submissions from the Machine Learning for Health (ML4H) workshop at the conference on Neural Information Processing Systems (NeurIPS) 2018, held on December 8, 2018 in Montreal, Canada.

MLMay 13, 2017
ShortFuse: Biomedical Time Series Representations in the Presence of Structured Information

Madalina Fiterau, Suvrat Bhooshan, Jason Fries et al.

In healthcare applications, temporal variables that encode movement, health status and longitudinal patient evolution are often accompanied by rich structured information such as demographics, diagnostics and medical exam data. However, current methods do not jointly optimize over structured covariates and time series in the feature extraction process. We present ShortFuse, a method that boosts the accuracy of deep learning models for time series by explicitly modeling temporal interactions and dependencies with structured covariates. ShortFuse introduces hybrid convolutional and LSTM cells that incorporate the covariates via weights that are shared across the temporal domain. ShortFuse outperforms competing models by 3% on two biomedical applications, forecasting osteoarthritis-related cartilage degeneration and predicting surgical outcomes for cerebral palsy patients, matching or exceeding the accuracy of models that use features engineered by domain experts.