280 Birds with One Stone: Inducing Multilingual Taxonomies from Wikipedia using Character-level Classification
This work addresses the need for large-scale, accurate multilingual taxonomies in natural language processing and knowledge representation, although its contribution is incremental: it builds on existing Wikipedia data and methods.
The paper tackles the problem of inducing multilingual taxonomies from Wikipedia by combining English taxonomies, interlanguage links, and character-level classifiers. The approach significantly outperforms state-of-the-art methods on six languages, and the resulting resource spans over 280 languages.
We propose a simple yet effective approach to inducing multilingual taxonomies from Wikipedia. Given an English taxonomy, our approach first leverages Wikipedia's interlanguage links and then applies character-level classifiers to induce high-precision, high-coverage taxonomies in other languages. Through experiments, we demonstrate that our approach significantly outperforms state-of-the-art, heuristics-heavy approaches for six languages. As a consequence of our work, we release what is presumably the largest and most accurate multilingual taxonomic resource, spanning over 280 languages.
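The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that English is-a edges are projected onto a target language through interlanguage links, and that the projected edges then serve as training data for a classifier over character n-grams of the edge titles. The names `project_edges`, `char_ngrams`, and `CharNgramNB` are hypothetical, the naive Bayes model is a stand-in for whatever classifier the paper actually uses, and the data is a toy example.

```python
import math
from collections import Counter


def project_edges(english_edges, interlanguage_links):
    """Map (child, parent) English edges into the target language.

    An edge is kept only when both endpoints have an interlanguage link.
    """
    projected = set()
    for child, parent in english_edges:
        if child in interlanguage_links and parent in interlanguage_links:
            projected.add((interlanguage_links[child], interlanguage_links[parent]))
    return projected


def char_ngrams(text, n=3):
    """Return character n-grams of a lowercased title, with boundary markers."""
    padded = f"^{text.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class CharNgramNB:
    """Tiny multinomial naive Bayes over character n-grams (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.priors = Counter(labels)
        self.counts = {}
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.counts.setdefault(label, Counter()).update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total = sum(self.priors.values())
        best_label, best_score = None, -math.inf
        for label, prior in self.priors.items():
            counts = self.counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(prior / total)
            for gram in char_ngrams(text):
                # Add-one smoothing so unseen n-grams do not zero out the score.
                score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Toy data (assumed for illustration only).
    english_edges = {("Italian physicists", "Physicists"), ("Physicists", "Scientists")}
    links_to_french = {"Italian physicists": "Physicien italien", "Physicists": "Physicien"}

    positives = project_edges(english_edges, links_to_french)
    negatives = {("Physicien italien", "Naissance en 1901")}  # assumed non-is-a edge

    texts = [f"{c} -> {p}" for c, p in positives] + [f"{c} -> {p}" for c, p in negatives]
    labels = [True] * len(positives) + [False] * len(negatives)

    clf = CharNgramNB().fit(texts, labels)
    print(clf.predict("Chimiste italien -> Chimiste"))
```

Working at the character level means the classifier needs no language-specific tokenizer or parser, which is one plausible reason such a design scales to many languages.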