Wikiwhere: An interactive tool for studying the geographical provenance of Wikipedia references
Wikiwhere addresses potential geographic bias in Wikipedia by giving researchers and casual readers an accessible tool for studying how references are distributed. The contribution is incremental in that it builds on existing methods for geographic analysis.
The paper tackles the problem of identifying geographic bias in Wikipedia references across language editions by developing Wikiwhere, a tool that analyzes and visualizes the geographic provenance of external links, enabling users to detect patterns such as differing national contexts in references for the same topic.
Wikipedia articles about the same topic in different language editions are built around different sources of information. For example, one can find very different news articles linked as references in the English Wikipedia article titled "Annexation of Crimea by the Russian Federation" than in its German counterpart (determined via Wikipedia's language links). Some of this difference can of course be attributed to the differing language proficiencies of readers and editors in each language edition. Yet although including English-language news sources seems to be no issue in the German edition, the English references listed there overlap little with those in the article's English version. Such patterns could indicate bias towards certain national contexts when referencing facts and statements in Wikipedia. However, determining which national context each reference can be traced back to, and then comparing the resulting link distributions, is infeasible for casual readers or scientists with non-technical backgrounds. Wikiwhere answers the question of where Web references stem from by analyzing and visualizing the geographic location of external reference links that are included in a given Wikipedia article. Instead of relying solely on the IP location of a given URL, our machine learning models consider several features.
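The kind of comparison described above, checking how much two language editions' references overlap and how their references spread across countries, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Wikiwhere's implementation: the function names, URLs, and country labels are hypothetical, and the country labels are taken as already assigned, since the features the models use are not listed here.

```python
from collections import Counter
from urllib.parse import urlparse


def domains(urls):
    """Return the set of host names (without a leading 'www.') in a list of URLs."""
    hosts = set()
    for url in urls:
        host = urlparse(url).hostname
        if not host:
            continue  # skip strings that do not parse as a URL with a host
        host = host.lower()
        if host.startswith("www."):
            host = host[4:]
        hosts.add(host)
    return hosts


def jaccard(a, b):
    """Jaccard overlap of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def country_shares(labels):
    """Turn a list of country labels into a {country: fraction} distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {country: n / total for country, n in counts.items()} if total else {}


# Hypothetical reference lists for the same topic in two language editions.
en_refs = [
    "https://www.example.com/a",
    "https://news.example.org/b",
    "http://example.de/c",
]
de_refs = [
    "https://example.de/d",
    "https://www.example.at/e",
]

overlap = jaccard(domains(en_refs), domains(de_refs))

# Hypothetical per-reference country labels, assumed to come from a classifier.
en_shares = country_shares(["US", "US", "DE", "GB"])
```

With these made-up inputs, the two editions share one of four distinct domains, so `overlap` is 0.25. Comparing at the domain level rather than the full URL is a design choice of this sketch; it treats two different articles from the same outlet as the same source.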