Rachel Grotheer

CV
9 papers
1,038 citations
Novelty 33%
AI Score 24

9 Papers

SP · Jun 14, 2018
Sparse Randomized Kaczmarz for Support Recovery of Jointly Sparse Corrupted Multiple Measurement Vectors

Natalie Durgin, Rachel Grotheer, Chenxi Huang et al.

While single measurement vector (SMV) models have been widely studied in signal processing, there is a surging interest in addressing the multiple measurement vectors (MMV) problem. In the MMV setting, more than one measurement vector is available and the multiple signals to be recovered share some commonalities such as a common support. Applications in which MMV is a naturally occurring phenomenon include online streaming, medical imaging, and video recovery. This work presents a stochastic iterative algorithm for the support recovery of jointly sparse corrupted MMV. We present a variant of the Sparse Randomized Kaczmarz algorithm for corrupted MMV and compare our proposed method with an existing Kaczmarz type algorithm for MMV problems. We also showcase the usefulness of our approach in the online (streaming) setting and provide empirical evidence that suggests the robustness of the proposed method to the distribution of the corruption and the number of corruptions occurring.
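
For orientation, the classic randomized Kaczmarz iteration that the sparse MMV variant builds on can be sketched as follows. This is only the textbook method for a consistent, uncorrupted single-vector system A x = b, not the algorithm proposed in the paper; the function name, the test problem, and all parameter values are illustrative assumptions.

```python
import numpy as np

def randomized_kaczmarz(A, b, num_iters=5000, seed=0):
    """Textbook randomized Kaczmarz for a consistent system A x = b.

    Illustrative sketch only: it has no sparsity weighting, no handling of
    corruptions, and works on a single measurement vector.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms_sq = np.sum(A ** 2, axis=1)
    # Sample rows with probability proportional to their squared norm.
    probs = row_norms_sq / row_norms_sq.sum()
    x = np.zeros(n)
    for _ in range(num_iters):
        i = rng.choice(m, p=probs)
        # Project the current iterate onto the hyperplane <A[i], x> = b[i].
        residual = b[i] - A[i] @ x
        x += (residual / row_norms_sq[i]) * A[i]
    return x

# Hypothetical test problem: overdetermined Gaussian system, sparse solution.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 20))
x_true = np.zeros(20)
x_true[[2, 7, 11]] = [1.0, -2.0, 0.5]
b = A @ x_true

x_hat = randomized_kaczmarz(A, b)
error = np.linalg.norm(x_hat - x_true)
```

Each step uses only one row of A, which is why methods of this type suit the online (streaming) setting mentioned in the abstract.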

CV · Aug 28, 2022
Automatic Infectious Disease Classification Analysis with Concept Discovery

Elena Sizikova, Joshua Vendrow, Xu Cao et al.

Automatic infectious disease classification from images can facilitate needed medical diagnoses. Such an approach can identify diseases, like tuberculosis, which remain under-diagnosed due to resource constraints and also novel and emerging diseases, like monkeypox, which clinicians have little experience or acumen in diagnosing. Avoiding missed or delayed diagnoses would prevent further transmission and improve clinical outcomes. In order to understand and trust neural network predictions, analysis of learned representations is necessary. In this work, we argue that automatic discovery of concepts, i.e., human interpretable attributes, allows for a deep understanding of learned information in medical image analysis tasks, generalizing beyond the training labels or protocols. We provide an overview of existing concept discovery approaches in medical image and computer vision communities, and evaluate representative methods on tuberculosis (TB) prediction and monkeypox prediction tasks. Finally, we propose NMFx, a general NMF formulation of interpretability by concept discovery that works in a unified way in unsupervised, weakly supervised, and supervised scenarios.

IT · Jun 19, 2018
Compressed Anomaly Detection with Multiple Mixed Observations

Natalie Durgin, Rachel Grotheer, Chenxi Huang et al.

We consider a collection of independent random variables that are identically distributed, except for a small subset which follows a different, anomalous distribution. We study the problem of detecting which random variables in the collection are governed by the anomalous distribution. Recent work proposes to solve this problem by conducting hypothesis tests based on mixed observations (e.g. linear combinations) of the random variables. Recognizing the connection between taking mixed observations and compressed sensing, we view the problem as recovering the "support" (index set) of the anomalous random variables from multiple measurement vectors (MMVs). Many algorithms have been developed for recovering jointly sparse signals and their support from MMVs. We establish the theoretical and empirical effectiveness of these algorithms at detecting anomalies. We also extend the LASSO algorithm to an MMV version for our purpose. Further, we perform experiments on synthetic data, consisting of samples from the random variables, to explore the trade-off between the number of mixed observations per sample and the number of samples required to detect anomalies.

SP · Jun 7, 2023
Stochastic Natural Thresholding Algorithms

Rachel Grotheer, Shuang Li, Anna Ma et al.

Sparse signal recovery is one of the most fundamental problems in various applications, including medical imaging and remote sensing. Many greedy algorithms based on the family of hard thresholding operators have been developed to solve the sparse signal recovery problem. More recently, Natural Thresholding (NT) has been proposed with improved computational efficiency. This paper proposes and discusses convergence guarantees for stochastic natural thresholding algorithms by extending the NT from the deterministic version with linear measurements to the stochastic version with a general objective function. We also conduct various numerical experiments on linear and nonlinear measurements to demonstrate the performance of StoNT.
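
As background for the hard thresholding family mentioned above, a minimal sketch of the hard thresholding operator and the classic deterministic Iterative Hard Thresholding (IHT) loop for linear measurements is given below. It is not Natural Thresholding or StoNT; the function names, the step-size choice, and the test problem are assumptions made for illustration.

```python
import numpy as np

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and set the rest to zero."""
    out = np.zeros_like(x)
    if s > 0:
        keep = np.argsort(np.abs(x))[-s:]
        out[keep] = x[keep]
    return out

def iht(A, y, s, step, num_iters=500):
    """Classic IHT: a gradient step on 0.5 * ||y - A x||^2, then thresholding."""
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        gradient_step = x + step * (A.T @ (y - A @ x))
        x = hard_threshold(gradient_step, s)
    return x

# Hypothetical test problem: Gaussian measurements of a 5-sparse signal.
rng = np.random.default_rng(0)
m, n, s = 100, 200, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
y = A @ x_true

# Assumed conservative step size 1 / ||A||_2^2, which keeps the residual
# from increasing between iterations.
step = 1.0 / np.linalg.norm(A, 2) ** 2
x_hat = iht(A, y, s, step=step)
```

The conservative step size trades speed for stability; larger steps can converge faster but may diverge when the measurement matrix is poorly conditioned.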

NA · Mar 2, 2018
Alternatives for Generating a Reduced Basis to Solve the Hyperspectral Diffuse Optical Tomography Model

Rachel Grotheer, Thilo Strauss, Phil Gralla et al.

The Reduced Basis Method (RBM) is a model reduction technique used to solve parametric PDEs that relies upon a basis set of solutions to the PDE at specific parameter values. To generate this reduced basis, a small set of parameter values must be chosen strategically. We apply a Metropolis algorithm and a gradient algorithm to find the set of parameters and compare them to the standard greedy algorithm most commonly used in the RBM. We test our methods by using the RBM to solve a simplified version of the governing partial differential equation for hyperspectral diffuse optical tomography (hyDOT). The governing equation for hyDOT is an elliptic PDE parameterized by the wavelength of the laser source. For this one-dimensional problem, we find that both the Metropolis and gradient algorithms are potentially superior alternatives to the greedy algorithm in that they generate a reduced basis which produces solutions with a smaller relative error with respect to solutions found using the finite element method and in less time.

IR · Feb 28, 2022
Semi-supervised Nonnegative Matrix Factorization for Document Classification

Jamie Haddock, Lara Kassab, Sixian Li et al.

We propose new semi-supervised nonnegative matrix factorization (SSNMF) models for document classification and provide motivation for these models as maximum likelihood estimators. The proposed SSNMF models simultaneously provide both a topic model and a model for classification, thereby offering highly interpretable classification results. We derive training methods using multiplicative updates for each new model, and demonstrate the application of these models to single-label and multi-label document classification, although the models are flexible to other supervised learning tasks such as regression. We illustrate the promise of these models and training methods on document classification datasets (e.g., 20 Newsgroups, Reuters).

LG · Oct 15, 2020
Semi-supervised NMF Models for Topic Modeling in Learning Tasks

Jamie Haddock, Lara Kassab, Sixian Li et al.

We propose several new models for semi-supervised nonnegative matrix factorization (SSNMF) and provide motivation for SSNMF models as maximum likelihood estimators given specific distributions of uncertainty. We present multiplicative updates training methods for each new model, and demonstrate the application of these models to classification, although they are flexible to other supervised learning tasks. We illustrate the promise of these models and training methods on both synthetic and real data, and achieve high classification accuracy on the 20 Newsgroups dataset.
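
To illustrate what multiplicative update training looks like, here is a sketch of the standard unsupervised NMF updates for the Frobenius-norm objective (the well-known Lee and Seung rules). It is not one of the SSNMF models proposed in the paper, which add a supervision term; the function name, the `eps` safeguard, and the synthetic data are assumptions.

```python
import numpy as np

def nmf_multiplicative(X, rank, num_iters=200, eps=1e-10, seed=0):
    """Unsupervised NMF, X ~ W @ H, via multiplicative updates.

    Minimizes ||X - W H||_F^2 subject to W, H >= 0. Illustrative sketch only;
    no label matrix or classification term is included.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(num_iters):
        # Elementwise updates keep every entry nonnegative;
        # eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical data: a nonnegative matrix with exact rank 3.
rng = np.random.default_rng(42)
X = rng.random((30, 3)) @ rng.random((3, 40))

W, H = nmf_multiplicative(X, rank=3)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

Because each update multiplies by a nonnegative ratio, no projection step is needed to maintain the nonnegativity constraints.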

DL · Sep 7, 2020
COVID-19 Literature Topic-Based Search via Hierarchical NMF

Rachel Grotheer, Yihuan Huang, Pengyu Li et al.

A dataset of COVID-19-related scientific literature is compiled, combining the articles from several online libraries and selecting those with open access and full text available. Then, hierarchical nonnegative matrix factorization is used to organize literature related to the novel coronavirus into a tree structure that allows researchers to search for relevant literature based on detected topics. We discover eight major latent topics and 52 granular subtopics in the body of literature, related to vaccines, genetic structure and modeling of the disease and patient studies, as well as related diseases and virology. In order that our tool may help current researchers, an interactive website is created that organizes available literature using this hierarchical structure.

NA · Aug 22, 2019
Iterative Hard Thresholding for Low CP-rank Tensor Models

Rachel Grotheer, Shuang Li, Anna Ma et al.

Recovery of low-rank matrices from a small number of linear measurements is now well-known to be possible under various model assumptions on the measurements. Such results demonstrate robustness and are backed with provable theoretical guarantees. However, extensions to tensor recovery have only recently begun to be studied and developed, despite an abundance of practical tensor applications. Recently, a tensor variant of the Iterative Hard Thresholding method was proposed and theoretical results were obtained that guarantee exact recovery of tensors with low Tucker rank. In this paper, we utilize the same tensor version of the Restricted Isometry Property (RIP) to extend these results for tensors with low CANDECOMP/PARAFAC (CP) rank. In doing so, we leverage recent results on efficient approximations of CP decompositions that remove the need for challenging assumptions in prior works. We complement our theoretical findings with empirical results that showcase the potential of the approach.