CV Aug 28, 2022
Automatic Infectious Disease Classification Analysis with Concept Discovery
Elena Sizikova, Joshua Vendrow, Xu Cao et al.
Automatic infectious disease classification from images can facilitate needed medical diagnoses. Such an approach can identify diseases such as tuberculosis, which remain under-diagnosed due to resource constraints, as well as novel and emerging diseases such as monkeypox, which clinicians have little experience or acumen in diagnosing. Avoiding missed or delayed diagnoses would prevent further transmission and improve clinical outcomes. To understand and trust neural network predictions, analysis of learned representations is necessary. In this work, we argue that automatic discovery of concepts, i.e., human-interpretable attributes, allows for a deep understanding of learned information in medical image analysis tasks, generalizing beyond the training labels or protocols. We provide an overview of existing concept discovery approaches in the medical imaging and computer vision communities, and evaluate representative methods on tuberculosis (TB) prediction and monkeypox prediction tasks. Finally, we propose NMFx, a general NMF formulation of interpretability by concept discovery that works in a unified way in unsupervised, weakly supervised, and supervised scenarios.
CO Mar 3, 2018
Density Tracking by Quadrature for Stochastic Differential Equations
Harish S. Bhat, R. W. M. A. Madushani
We develop and analyze a method, density tracking by quadrature (DTQ), to compute the probability density function of the solution of a stochastic differential equation. The derivation of the method begins with the discretization in time of the stochastic differential equation, resulting in a discrete-time Markov chain with continuous state space. At each time step, DTQ applies quadrature to solve the Chapman-Kolmogorov equation for this Markov chain. In this paper, we focus on a particular case of the DTQ method that arises from applying the Euler-Maruyama method in time and the trapezoidal quadrature rule in space. Our main result establishes that the density computed by DTQ converges in $L^1$ to both the exact density of the Markov chain (with exponential convergence rate), and to the exact density of the stochastic differential equation (with first-order convergence rate). We establish a Chernoff bound that implies convergence of a domain-truncated version of DTQ. We carry out numerical tests to show that the empirical performance of DTQ matches theoretical results, and also to demonstrate that DTQ can compute densities several times faster than a Fokker-Planck solver, for the same level of error.
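The Euler-Maruyama plus trapezoidal-rule case described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `dtq_density`, the truncation `half_width`, and the grid-spacing rule k = h^rho with rho = 0.75 are choices made for this example, and the diffusion coefficient is assumed to be nonzero everywhere on the grid.

```python
import numpy as np

def dtq_density(drift, diffusion, x0, T, h, half_width=5.0, rho=0.75):
    """Approximate the density of X_T for dX = drift(X) dt + diffusion(X) dW, X_0 = x0.

    Hypothetical helper: time is discretized with Euler-Maruyama (step h), and the
    Chapman-Kolmogorov integral is evaluated with an equispaced quadrature rule on a
    truncated grid. Assumes diffusion(x) != 0 on the grid.
    """
    k = h ** rho                                  # spatial grid spacing (assumed rule)
    m = int(np.ceil(half_width / k))
    grid = x0 + k * np.arange(-m, m + 1)          # truncated grid centered at x0
    n_steps = int(round(T / h))

    def gaussian(x, mean, std):
        return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

    # Euler-Maruyama transition density: given y, the next state is
    # Normal(y + drift(y) h, diffusion(y)^2 h). kernel[i, j] = G(x_i | y_j).
    mean = grid + drift(grid) * h
    std = np.abs(diffusion(grid)) * np.sqrt(h)
    kernel = gaussian(grid[:, None], mean[None, :], std[None, :])

    # The first step from the point mass at x0 is known in closed form.
    p = gaussian(grid, x0 + drift(x0) * h, np.abs(diffusion(x0)) * np.sqrt(h))

    # Remaining steps: p_{n+1}(x_i) = k * sum_j G(x_i | y_j) p_n(y_j).
    # Endpoint weights are left at k because the density is assumed
    # negligible at the edges of the truncated grid.
    for _ in range(n_steps - 1):
        p = k * (kernel @ p)
    return grid, p
```

For an Ornstein-Uhlenbeck process (`drift = lambda x: -x`, `diffusion = lambda x: np.ones_like(x)`) the exact density at time T is Gaussian, which gives a convenient check of the computed density against a known answer.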
LG Oct 15, 2020
Semi-supervised NMF Models for Topic Modeling in Learning Tasks
Jamie Haddock, Lara Kassab, Sixian Li et al.
We propose several new models for semi-supervised nonnegative matrix factorization (SSNMF) and provide motivation for SSNMF models as maximum likelihood estimators given specific distributions of uncertainty. We present multiplicative-update training methods for each new model, and demonstrate the application of these models to classification, although they can be adapted to other supervised learning tasks. We illustrate the promise of these models and training methods on both synthetic and real data, and achieve high classification accuracy on the 20 Newsgroups dataset.
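To make the idea of multiplicative updates for SSNMF concrete, here is a minimal Python sketch. It assumes the Frobenius-norm objective ||X - AS||^2 + lam * ||Y - BS||^2, which is one common SSNMF formulation and not necessarily any of the new models this abstract proposes. The function name `ssnmf`, the weight `lam`, and the stabilizer `eps` are hypothetical choices for this example.

```python
import numpy as np

def ssnmf(X, Y, rank, lam=1.0, n_iter=200, seed=0, eps=1e-10):
    """Hypothetical sketch: fit X ~ A @ S and Y ~ B @ S with nonnegative factors.

    X: (features, samples) nonnegative data matrix.
    Y: (classes, samples) nonnegative label matrix, e.g. one-hot columns.
    Objective (assumed): ||X - A S||_F^2 + lam * ||Y - B S||_F^2.
    """
    rng = np.random.default_rng(seed)
    A = rng.random((X.shape[0], rank))
    B = rng.random((Y.shape[0], rank))
    S = rng.random((rank, X.shape[1]))
    losses = []
    for _ in range(n_iter):
        # Each factor is rescaled elementwise by a ratio of nonnegative terms,
        # so nonnegativity is preserved without any projection step.
        A *= (X @ S.T) / (A @ S @ S.T + eps)
        B *= (Y @ S.T) / (B @ S @ S.T + eps)
        S *= (A.T @ X + lam * B.T @ Y) / (A.T @ A @ S + lam * B.T @ B @ S + eps)
        loss = np.linalg.norm(X - A @ S) ** 2 + lam * np.linalg.norm(Y - B @ S) ** 2
        losses.append(loss)
    return A, B, S, losses
```

Under this formulation, a predicted class for each training sample can be read off as `np.argmax(B @ S, axis=0)`. Classifying unseen samples would additionally require estimating their representation from `A`, which this sketch leaves out.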
LG Jan 2, 2020
On Large-Scale Dynamic Topic Modeling with Nonnegative CP Tensor Decomposition
Miju Ahn, Nicole Eikmeier, Jamie Haddock et al.
There is currently an unprecedented demand for large-scale temporal data analysis due to the explosive growth of data. Dynamic topic modeling has been widely used in social and data sciences with the goal of learning latent topics that emerge, evolve, and fade over time. Previous work on dynamic topic modeling primarily employs the method of nonnegative matrix factorization (NMF), where slices of the data tensor are each factorized into the product of lower-dimensional nonnegative matrices. With this approach, however, information contained in the temporal dimension of the data is often neglected or underutilized. To overcome this issue, we propose instead adopting the method of nonnegative CANDECOMP/PARAFAC (CP) tensor decomposition (NNCPD), where the data tensor is directly decomposed into a minimal sum of outer products of nonnegative vectors, thereby preserving the temporal information. The viability of NNCPD is demonstrated through application to both synthetic and real data, where significantly improved results are obtained compared to those of typical NMF-based methods. The advantages of NNCPD over such approaches are studied and discussed. To the best of our knowledge, this is the first time that NNCPD has been utilized for the purpose of dynamic topic modeling, and our findings will be transformative for both applications and further developments.
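The decomposition of a data tensor into a sum of outer products of nonnegative vectors can be illustrated with a short Python sketch. The abstract does not say how NNCPD is fitted, so the multiplicative-update scheme below is an assumption chosen for brevity. The function name `nncpd`, the axis ordering (for example documents x words x time), and the stabilizer `eps` are likewise hypothetical.

```python
import numpy as np

def nncpd(T, rank, n_iter=300, seed=0, eps=1e-10):
    """Hypothetical sketch: nonnegative CP decomposition of a 3-way tensor.

    Approximates T[i, j, k] by sum_r A[i, r] * B[j, r] * C[k, r] with all
    factor entries nonnegative, using multiplicative updates (an assumed
    fitting method, not one stated in the abstract).
    """
    rng = np.random.default_rng(seed)
    A = rng.random((T.shape[0], rank))
    B = rng.random((T.shape[1], rank))
    C = rng.random((T.shape[2], rank))
    norm_T = np.linalg.norm(T)
    errors = []
    for _ in range(n_iter):
        # Numerators contract T with the other two factors; denominators use
        # elementwise (Hadamard) products of the factors' Gram matrices.
        A *= np.einsum("ijk,jr,kr->ir", T, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum("ijk,ir,kr->jr", T, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum("ijk,ir,jr->kr", T, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
        approx = np.einsum("ir,jr,kr->ijk", A, B, C)
        errors.append(np.linalg.norm(T - approx) / norm_T)  # relative fit error
    return A, B, C, errors
```

With a documents x words x time tensor, column r of the third factor `C` would describe how strongly topic r is present at each time step, which is the temporal information that slice-wise NMF does not model jointly.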