AIHC | Jan 12, 2023

Mephisto: A Framework for Portable, Reproducible, and Iterative Crowdsourcing

arXiv:2301.05154v1 | 7 citations | h-index: 14 | Has Code
Originality: Synthesis-oriented
AI Analysis

Mephisto addresses the need for better tooling to support open-source data collection in ML research, though the contribution is incremental because it builds on existing practices.

The authors tackle the problem of making crowdsourcing for research more reproducible and collaborative. They introduce Mephisto, a framework whose abstractions for task designs and workflows simplify data collection and annotation.

We introduce Mephisto, a framework to make crowdsourcing for research more reproducible, transparent, and collaborative. Mephisto provides abstractions that cover a broad set of task designs and data collection workflows, and provides a simple user experience to make best-practices easy defaults. In this whitepaper we discuss the current state of data collection and annotation in ML research, establish the motivation for building a shared framework to enable researchers to create and open-source data collection and annotation tools as part of their publication, and outline a set of suggested requirements for a system to facilitate these goals. We then step through our resolution in Mephisto, explaining the abstractions we use, our design decisions around the user experience, and share implementation details and where they align with the original motivations. We also discuss current limitations, as well as future work towards continuing to deliver on the framework's initial goals. Mephisto is available as an open source project, and its documentation can be found at www.mephisto.ai.

Code Implementations: 1 repo
