ML LG AP ME May 22, 2018

Variational Learning on Aggregate Outputs with Gaussian Processes

arXiv:1805.08463v1
40 citations
AI Analysis

This work addresses a critical issue in applications such as global disease mapping, where a mismatch in data granularity between inputs and outputs hinders accurate prediction. It offers a scalable solution with explicit uncertainty handling.

The paper tackles the problem of supervised learning when outputs are aggregated at a coarser level than inputs, such as in disease mapping, by proposing a variational learning approach with Gaussian processes and new bounds to handle intractability. It achieves improved prediction accuracy and scalability, demonstrated on malaria incidence modeling with over 1 million observations.

While a typical supervised learning framework assumes that the inputs and the outputs are measured at the same levels of granularity, many applications, including global mapping of disease, only have access to outputs at a much coarser level than that of the inputs. Aggregation of outputs makes generalization to new inputs much more difficult. We consider an approach to this problem based on variational learning with a model of output aggregation and Gaussian processes, where aggregation leads to intractability of the standard evidence lower bounds. We propose new bounds and tractable approximations, leading to improved prediction accuracy and scalability to large datasets, while explicitly taking uncertainty into account. We develop a framework which extends to several types of likelihoods, including the Poisson model for aggregated count data. We apply our framework to a challenging and important problem, the fine-scale spatial modelling of malaria incidence, with over 1 million observations.
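To make the aggregation setting concrete, here is a minimal sketch of an aggregated Poisson observation model of the kind the abstract describes. It rests on assumptions not stated in the text above: each coarse region ("bag") has a Poisson count whose rate is a weighted sum of `exp(f(x))` over its fine-scale inputs, the latent function `f` is a draw from a GP prior with an RBF kernel, and the weights stand in for something like population. All function names and parameter values are hypothetical, and the sketch only simulates data and evaluates the bag-level log-likelihood; it does not implement the variational bounds proposed in the paper.

```python
import numpy as np
from math import lgamma


def rbf_kernel(X, Z, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of X and Z."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :] - 2.0 * X @ Z.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def aggregate_rates(f, weights, bag_ids, n_bags):
    """Bag-level rate: sum over the bag's inputs of weight_i * exp(f_i) (assumed link)."""
    return np.bincount(bag_ids, weights=weights * np.exp(f), minlength=n_bags)


def poisson_loglik(y, rate):
    """Sum of Poisson log-probabilities of counts y under the given rates."""
    log_fact = np.array([lgamma(v + 1.0) for v in y])
    return float(np.sum(y * np.log(rate) - rate - log_fact))


rng = np.random.default_rng(0)
n_bags, per_bag = 5, 10                                # hypothetical sizes
X = rng.uniform(0.0, 10.0, size=(n_bags * per_bag, 1))  # fine-scale inputs
bag_ids = np.repeat(np.arange(n_bags), per_bag)         # region of each input
weights = rng.uniform(0.5, 2.0, size=X.shape[0])        # e.g. population per input

# Draw one latent function from the GP prior (jitter added for numerical stability).
K = rbf_kernel(X, X, lengthscale=2.0) + 1e-6 * np.eye(X.shape[0])
f = np.linalg.cholesky(K) @ rng.standard_normal(X.shape[0])

rates = aggregate_rates(f, weights, bag_ids, n_bags)    # one rate per bag
y = rng.poisson(rates)                                  # only these counts are observed
ll = poisson_loglik(y, rates)
```

Only `y` is available at training time, one count per bag, while predictions are wanted for `f` at individual inputs. Because the latent values enter the likelihood through a sum inside the rate, the likelihood does not factorise over individual inputs, which is where the intractability mentioned in the abstract arises.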

Code Implementations: 1 repo
