MM · CV · SI · Apr 4, 2019

MMED: A Multi-domain and Multi-modality Event Dataset

arXiv:1904.02354v2 · 11 citations
Originality: Synthesis-oriented
AI Analysis

This dataset targets researchers in event discovery and cross-modal retrieval who need to organize and summarize heterogeneous data contributed by professionals and amateurs. The contribution is incremental, since it builds on existing event datasets.

The authors constructed and released a multi-domain and multi-modality event dataset (MMED) with 25,165 textual news articles and 76,516 image posts annotated for 412 real-world events, aiming to facilitate research on organizing heterogeneous data and transferring event knowledge across domains.

In this work, we construct and release a multi-domain and multi-modality event dataset (MMED), containing 25,165 textual news articles collected from hundreds of news media sites (e.g., Yahoo News, Google News, CNN News) and 76,516 image posts shared on the Flickr social media platform, all annotated according to 412 real-world events. The dataset is collected to explore two problems: organizing heterogeneous data contributed by professionals and amateurs in different data domains, and transferring event knowledge obtained from one data domain to a heterogeneous data domain, thereby summarizing data from different contributors. We hope that the release of the MMED dataset can stimulate innovative research on related challenging problems, such as event discovery, cross-modal (event) retrieval, and visual question answering.
