IR, Aug 13, 2015

Enabling Complex Wikipedia Queries - Technical Report

arXiv:1508.03298v1, 1 citation
Originality: Synthesis-oriented
AI Analysis

This work offers a practical tool for researchers and developers in fields such as information retrieval and recommender systems. Its contribution is incremental, since it builds on existing methods for databases and for handling Wikipedia data.

The authors address the problem of running complex queries over Wikipedia data. They design a database schema that captures the anchor text of links, link locations, redirect pages, and paragraph structure. They have made the schema publicly available for use in applications such as recommender systems, information retrieval, and sentiment analysis.

In this technical report we present a database schema used to store Wikipedia so it can be easily used in query-intensive applications. In addition to storing the information in a way that makes it highly accessible, our schema enables users to easily formulate complex queries using information such as the anchor-text of links and their location in the page, the titles and number of redirect pages for each page and the paragraph structure of entity pages. We have successfully used the schema in domains such as recommender systems, information retrieval and sentiment analysis. In order to assist other researchers, we now make the schema and its content available online.
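The kind of query the abstract describes can be illustrated with a minimal sketch. The report's actual schema is not reproduced here, so every table name, column name, and row of data below is a hypothetical stand-in chosen for illustration. SQLite is used only so that the example runs as-is. It shows two queries of the sort mentioned above: counting redirect pages per page, and retrieving the anchor text of links that appear in a page's first paragraph.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the schema published by the authors.
SCHEMA = """
CREATE TABLE page (
    page_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL UNIQUE
);
CREATE TABLE redirect (
    redirect_title TEXT PRIMARY KEY,
    target_page_id INTEGER NOT NULL REFERENCES page(page_id)
);
CREATE TABLE paragraph (
    paragraph_id INTEGER PRIMARY KEY,
    page_id      INTEGER NOT NULL REFERENCES page(page_id),
    position     INTEGER NOT NULL  -- 0 = first paragraph of the page
);
CREATE TABLE link (
    source_paragraph_id INTEGER NOT NULL REFERENCES paragraph(paragraph_id),
    target_page_id      INTEGER NOT NULL REFERENCES page(page_id),
    anchor_text         TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO page VALUES (?, ?)",
                     [(1, "Alpha"), (2, "Beta")])
    conn.executemany("INSERT INTO redirect VALUES (?, ?)",
                     [("Alfa", 1), ("Alpha (letter)", 1), ("Beta (letter)", 2)])
    conn.executemany("INSERT INTO paragraph VALUES (?, ?, ?)",
                     [(10, 1, 0), (11, 1, 1)])
    conn.executemany("INSERT INTO link VALUES (?, ?, ?)",
                     [(10, 2, "second letter"), (11, 2, "beta")])
    conn.commit()
    return conn


def redirect_counts(conn):
    """Return (title, number of redirect pages) for every page, by title."""
    return conn.execute("""
        SELECT p.title, COUNT(r.redirect_title)
        FROM page p
        LEFT JOIN redirect r ON r.target_page_id = p.page_id
        GROUP BY p.page_id, p.title
        ORDER BY p.title
    """).fetchall()


def first_paragraph_anchors(conn, target_title):
    """Return anchor texts of links to target_title found in first paragraphs."""
    rows = conn.execute("""
        SELECT l.anchor_text
        FROM link l
        JOIN paragraph pa ON pa.paragraph_id = l.source_paragraph_id
        JOIN page t ON t.page_id = l.target_page_id
        WHERE t.title = ? AND pa.position = 0
        ORDER BY l.anchor_text
    """, (target_title,)).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(redirect_counts(conn))
    print(first_paragraph_anchors(conn, "Beta"))
```

Keeping links keyed by paragraph rather than by page is one possible way to make link location queryable; the published schema may organize this differently.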

