IRCLOct 19, 2012

1 Billion Pages = 1 Million Dollars? Mining the Web to Play "Who Wants to be a Millionaire?"

arXiv:1212.2477v129 citations
Originality Incremental advance
AI Analysis

This work addresses the challenge of automated question-answering from unstructured web sources, demonstrating practical applications in trivia games, though it is incremental in combining existing techniques.

The researchers tackled the problem of creating a computerized player for the game show 'Who Wants To Be A Millionaire?' by mining web data, achieving about 75% correct answers on trivia questions and performing comparably to humans in simulated gameplay.

We exploit the redundancy and volume of information on the web to build a computerized player for the ABC TV game show 'Who Wants To Be A Millionaire?' The player consists of a question-answering module and a decision-making module. The question-answering module utilizes question transformation techniques, natural language parsing, multiple information retrieval algorithms, and multiple search engines; results are combined in the spirit of ensemble learning using an adaptive weighting scheme. Empirically, the system correctly answers about 75% of questions from the Millionaire CD-ROM, 3rd edition - general-interest trivia questions often about popular culture and common knowledge. The decision-making module chooses from allowable actions in the game in order to maximize expected risk-adjusted winnings, where the estimated probability of answering correctly is a function of past performance and confidence in in correctly answering the current question. When given a six question head start (i.e., when starting from the $2,000 level), we find that the system performs about as well on average as humans starting at the beginning. Our system demonstrates the potential of simple but well-chosen techniques for mining answers from unstructured information such as the web.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes