William Herlands

h-index: 9

9 papers

101 citations

Novelty: 23%

AI Score: 18

Ranked #188,554 of 194,257 authors (top 97%); #3,311 in ML (top 98%)

9 Papers

6.1 | ML | Apr 4, 2018 | Code

Gaussian Process Subset Scanning for Anomalous Pattern Detection in Non-iid Data

William Herlands, Edward McFowland, Andrew Gordon Wilson et al.

Identifying anomalous patterns in real-world data is essential for understanding where, when, and how systems deviate from their expected dynamics. Yet methods that separately consider the anomalousness of each individual data point have low detection power for subtle, emerging irregularities. Additionally, recent detection techniques based on subset scanning make strong independence assumptions and suffer degraded performance in correlated data. We introduce methods for identifying anomalous patterns in non-iid data by combining Gaussian processes with a novel log-likelihood ratio statistic and subset scanning techniques. Our approaches are powerful, interpretable, and can integrate information across multiple data streams. We illustrate their performance on numeric simulations and three open source spatiotemporal datasets of opioid overdose deaths, 311 calls, and storm reports.

1.2 | CY | Dec 21, 2018

Proceedings of NeurIPS 2018 Workshop on Machine Learning for the Developing World: Achieving Sustainable Impact

Maria De-Arteaga, Amanda Coston, William Herlands

This is the Proceedings of NeurIPS 2018 Workshop on Machine Learning for the Developing World: Achieving Sustainable Impact, held in Montreal, Canada on December 8, 2018.

2.7 | ML | Oct 28, 2018

Change Surfaces for Expressive Multidimensional Changepoints and Counterfactual Prediction

William Herlands, Daniel B. Neill, Hannes Nickisch et al.

Identifying changes in model parameters is fundamental in machine learning and statistics. However, standard changepoint models are limited in expressiveness, often addressing unidimensional problems and assuming instantaneous changes. We introduce change surfaces as a multidimensional and highly expressive generalization of changepoints. We provide a model-agnostic formalization of change surfaces, illustrating how they can provide variable, heterogeneous, and non-monotonic rates of change across multiple dimensions. Additionally, we show how change surfaces can be used for counterfactual prediction. As a concrete instantiation of the change surface framework, we develop Gaussian Process Change Surfaces (GPCS). We demonstrate counterfactual prediction with Bayesian posterior mean and credible sets, as well as massive scalability by introducing novel methods for additive non-separable kernels. Using two large spatio-temporal datasets we employ GPCS to discover and characterize complex changes that can provide scientific and policy relevant insights. Specifically, we analyze twentieth century measles incidence across the United States and discover previously unknown heterogeneous changes after the introduction of the measles vaccine. Additionally, we apply the model to requests for lead testing kits in New York City, discovering distinct spatial and demographic patterns.
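The idea of a change surface in the abstract above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the GPCS model itself: the sigmoid form of `change_surface`, its constants, and the two constant regimes `before` and `after` are all hypothetical choices made for illustration. It shows a change whose timing and abruptness vary with location, and a counterfactual obtained by extrapolating the pre-change regime.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def change_surface(x: float, t: float) -> float:
    """Weight in [0, 1] on the post-change regime at location x >= 0, time t.

    Hypothetical form: the change arrives later and more gradually
    as x grows, so the transition is heterogeneous across space.
    """
    centre = 5.0 + 2.0 * x  # change time depends on location
    width = 0.5 + x         # so does how gradual the change is
    return sigmoid((t - centre) / width)


def before(x: float, t: float) -> float:
    """Illustrative pre-change regime (constant level)."""
    return 10.0


def after(x: float, t: float) -> float:
    """Illustrative post-change regime (constant level)."""
    return 2.0


def observed_mean(x: float, t: float) -> float:
    """Blend of the two regimes, weighted by the change surface."""
    s = change_surface(x, t)
    return (1.0 - s) * before(x, t) + s * after(x, t)


def counterfactual_mean(x: float, t: float) -> float:
    """What the pre-change regime alone would have predicted."""
    return before(x, t)
```

With these assumed constants, the change at `x = 0` is centred at `t = 5`, while at `x = 1` it is centred at `t = 7` and spread over a wider window. The gap between `counterfactual_mean` and `observed_mean` is the estimated effect of the change at each point.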

6.0 | ML | Nov 27, 2017

Proceedings of NIPS 2017 Symposium on Interpretable Machine Learning

Andrew Gordon Wilson, Jason Yosinski, Patrice Simard et al.

This is the Proceedings of NIPS 2017 Symposium on Interpretable Machine Learning, held in Long Beach, California, USA on December 7, 2017.

1.0 | ML | Nov 27, 2017

Proceedings of NIPS 2017 Workshop on Machine Learning for the Developing World

Maria De-Arteaga, William Herlands

This is the Proceedings of NIPS 2017 Workshop on Machine Learning for the Developing World, held in Long Beach, California, USA on December 8, 2017.

4.3 | CY | Oct 6, 2017

Machine Learning for Drug Overdose Surveillance

Daniel B. Neill, William Herlands

We describe two recently proposed machine learning approaches for discovering emerging trends in fatal accidental drug overdoses. The Gaussian Process Subset Scan enables early detection of emerging patterns in spatio-temporal data, accounting for both the non-iid nature of the data and the fact that detecting subtle patterns requires integration of information across multiple spatial areas and multiple time steps. We apply this approach to 17 years of county-aggregated data for monthly opioid overdose deaths in the New York City metropolitan area, showing clear advantages in the utility of discovered patterns as compared to typical anomaly detection approaches. To detect and characterize emerging overdose patterns that differentially affect a subpopulation of the data, including geographic, demographic, and behavioral patterns (e.g., which combinations of drugs are involved), we apply the Multidimensional Tensor Scan to 8 years of case-level overdose data from Allegheny County, PA. We discover previously unidentified overdose patterns which reveal unusual demographic clusters, show impacts of drug legislation, and demonstrate potential for early detection and targeted intervention. These approaches to early detection of overdose patterns can inform prevention and response efforts and improve understanding of the effects of policy changes.

5.5 | ML | Nov 28, 2016

Proceedings of NIPS 2016 Workshop on Interpretable Machine Learning for Complex Systems

Andrew Gordon Wilson, Been Kim, William Herlands

This is the Proceedings of NIPS 2016 Workshop on Interpretable Machine Learning for Complex Systems, held in Barcelona, Spain on December 9, 2016.

7.0 | ML | Nov 13, 2015

Scalable Gaussian Processes for Characterizing Multidimensional Change Surfaces

William Herlands, Andrew Wilson, Hannes Nickisch et al.

We present a scalable Gaussian process model for identifying and characterizing smooth multidimensional changepoints, and automatically learning changes in expressive covariance structure. We use Random Kitchen Sink features to flexibly define a change surface in combination with expressive spectral mixture kernels to capture the complex statistical structure. Finally, through the use of novel methods for additive non-separable kernels, we can scale the model to large datasets. We demonstrate the model on numerical and real-world data, including a large spatio-temporal disease dataset where we identify previously unknown heterogeneous changes in space and time.
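One way to read "Random Kitchen Sink features to flexibly define a change surface" is sketched below. It is an illustration under assumptions rather than the paper's implementation: the function name `make_rks_change_surface`, the Gaussian frequency distribution, the logistic squashing, and the randomly drawn amplitudes (which a fitted model would learn from data) are all hypothetical. The sketch builds a weight function from a sum of random cosine features and maps it into [0, 1].

```python
import math
import random
from typing import Callable, Sequence


def make_rks_change_surface(
    dim: int,
    n_features: int = 20,
    lengthscale: float = 1.0,
    seed: int = 0,
) -> Callable[[Sequence[float]], float]:
    """Return s(x) in [0, 1] built from random cosine features.

    Assumed form: w(x) = sum_i a_i * cos(omega_i . x + b_i), s(x) = logistic(w(x)).
    """
    rng = random.Random(seed)
    # Random frequencies and phases are drawn once and then held fixed.
    freqs = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
             for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    # Amplitudes are random here; a fitted model would learn them.
    amps = [rng.gauss(0.0, 1.0) for _ in range(n_features)]

    def warp(x: Sequence[float]) -> float:
        return sum(
            a * math.cos(sum(f_d * x_d for f_d, x_d in zip(f, x)) + p)
            for a, f, p in zip(amps, freqs, phases)
        )

    def surface(x: Sequence[float]) -> float:
        w = warp(x)
        # Numerically stable logistic function.
        if w >= 0:
            return 1.0 / (1.0 + math.exp(-w))
        e = math.exp(w)
        return e / (1.0 + e)

    return surface


# Usage: weight on one of two regimes at a (space, time) input.
s = make_rks_change_surface(dim=2, seed=1)
weight = s([0.3, 4.0])
```

Because the cosine features can rise and fall anywhere in the input space, a surface of this kind is not restricted to a single monotone transition, which is the flexibility the abstract refers to.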

1.5 | ML | Nov 13, 2015

Lass-0: sparse non-convex regression by local search

William Herlands, Maria De-Arteaga, Daniel Neill et al.

We compute approximate solutions to L0-regularized linear regression using L1 regularization, also known as the Lasso, as an initialization step. Our algorithm, the Lass-0 ("Lass-zero"), uses a computationally efficient stepwise search to determine a locally optimal L0 solution given any L1 regularization solution. We present theoretical results of consistency under orthogonality and appropriate handling of redundant features. Empirically, we use synthetic data to demonstrate that Lass-0 solutions are closer to the true sparse support than L1 regularization models. Additionally, on real-world data Lass-0 finds more parsimonious solutions than L1 regularization while maintaining similar predictive accuracy.
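The stepwise search described above can be sketched as a local search over feature supports. This is a simplified illustration, not the authors' exact procedure: it assumes the objective is the residual sum of squares plus `lam` times the support size, it takes the initial support as an input (in practice it would come from a Lasso fit), and the helper names `solve_linear`, `rss`, and `lass0_style_search` are hypothetical. It also assumes the selected columns are linearly independent.

```python
from typing import Iterable, List, Sequence, Set


def solve_linear(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def rss(X: Sequence[Sequence[float]], y: Sequence[float], support: Set[int]) -> float:
    """Residual sum of squares of least squares restricted to `support`."""
    cols = sorted(support)
    if not cols:
        return sum(v * v for v in y)
    gram = [[sum(row[a] * row[b] for row in X) for b in cols] for a in cols]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in cols]
    beta = solve_linear(gram, xty)
    return sum(
        (yi - sum(bj * row[j] for bj, j in zip(beta, cols))) ** 2
        for row, yi in zip(X, y)
    )


def lass0_style_search(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    init_support: Iterable[int],
    lam: float,
) -> Set[int]:
    """Toggle one feature at a time while RSS + lam * |support| decreases."""
    support = set(init_support)
    best = rss(X, y, support) + lam * len(support)
    improved = True
    while improved:
        improved = False
        for j in range(len(X[0])):
            candidate = support ^ {j}  # add j if absent, drop it if present
            value = rss(X, y, candidate) + lam * len(candidate)
            if value < best - 1e-12:
                support, best, improved = candidate, value, True
    return support
```

Each accepted move strictly lowers the objective and there are finitely many supports, so the loop terminates at a support that no single addition or removal can improve.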