CYLGME
Oct 26, 2014

Random Sampling in an Age of Automation: Minimizing Expenditures through Balanced Collection and Annotation

arXiv:1410.7074v4
1 citation
Originality: Incremental advance
AI Analysis

This paper addresses cost reduction in sampling surveys for applications such as ecological monitoring. The advance is incremental: it builds on existing sampling methods by combining them in a hybrid approach.

The paper tackles the problem of estimating a population mean under the new cost structures created by automated collection and annotation. It proposes a Hybrid-Offset sampling design that pairs an accurate but costly primary annotator with a noisy but cheap auxiliary annotator to minimize total expenditures. In simulations on data from a coral reef survey, the design reduces sampling costs by 50% compared to the conventional design.

Methods for automated collection and annotation are changing the cost-structures of sampling surveys for a wide range of applications. Digital samples in the form of images or audio recordings can be collected rapidly, and annotated by computer programs or crowd workers. We consider the problem of estimating a population mean under these new cost-structures, and propose a Hybrid-Offset sampling design. This design utilizes two annotators: a primary, which is accurate but costly (e.g. a human expert) and an auxiliary which is noisy but cheap (e.g. a computer program), in order to minimize total sampling expenditures. Our analysis gives necessary conditions for the Hybrid-Offset design and specifies optimal sample sizes for both annotators. Simulations on data from a coral reef survey program indicate that the Hybrid-Offset design outperforms several alternative sampling designs. In particular, sampling expenditures are reduced 50% compared to the Conventional design currently deployed by the coral ecologists.
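The two-annotator idea can be illustrated with a standard difference-style estimator from survey sampling. This is a minimal sketch under stated assumptions, not the paper's Hybrid-Offset estimator, whose exact form and optimal sample sizes are not given in the abstract. The function names, the per-sample costs, and the simulated bias and noise levels are all hypothetical.

```python
import random
import statistics


def offset_estimate(aux_all, primary_subset, aux_subset):
    """Estimate a population mean from cheap and costly annotations.

    aux_all        : auxiliary (cheap, noisy) annotations for every sample
    primary_subset : primary (costly, accurate) annotations for a subsample
    aux_subset     : auxiliary annotations for that same subsample
    """
    if not primary_subset or len(primary_subset) != len(aux_subset):
        raise ValueError("primary_subset and aux_subset must be paired and non-empty")
    # Average gap between the accurate and the noisy annotator on the subsample.
    offset = statistics.fmean([p - a for p, a in zip(primary_subset, aux_subset)])
    # Shift the cheap large-sample mean by that estimated gap.
    return statistics.fmean(aux_all) + offset


def total_cost(n_aux, n_primary, cost_aux, cost_primary):
    """Annotation expenditure under hypothetical per-sample costs."""
    return n_aux * cost_aux + n_primary * cost_primary


def simulate(n_aux=2000, n_primary=100, bias=-0.05, noise=0.1, seed=0):
    """Toy simulation with made-up numbers (an assumption, not survey data).

    Returns (true mean, naive auxiliary mean, offset-corrected mean).
    """
    rng = random.Random(seed)
    truth = [rng.random() for _ in range(n_aux)]
    # The auxiliary annotator is systematically biased and noisy.
    aux = [t + bias + rng.gauss(0.0, noise) for t in truth]
    # Only a small random subsample is sent to the primary annotator.
    idx = rng.sample(range(n_aux), n_primary)
    primary = [truth[i] for i in idx]  # primary treated as error-free here
    corrected = offset_estimate(aux, primary, [aux[i] for i in idx])
    return statistics.fmean(truth), statistics.fmean(aux), corrected
```

Calling `simulate()` returns the true mean, the uncorrected auxiliary mean, and the corrected estimate, so the effect of the offset correction can be inspected directly. `total_cost` shows how expenditure trades off between the two annotators for a given pair of sample sizes.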
