Global Entity Ranking Across Multiple Languages
This work addresses entity ranking for multilingual applications. The contribution appears incremental, since it builds on existing knowledge bases and features.
The paper builds a global ranking of entities across multiple languages using Wikipedia and Freebase. The resulting system ranks 27 million entities and reports 75% precision and a 48% F1 score.
We present work on building a global long-tailed ranking of entities across multiple languages using the Wikipedia and Freebase knowledge bases. We identify multiple features and build a model to rank entities using a ground-truth dataset of more than 10,000 labels. The final system ranks 27 million entities with 75% precision and a 48% F1 score. We provide a performance evaluation and empirical evidence of the quality of the ranking across languages, and release the final ranked lists for future research.
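The abstract reports precision and F1 but not recall. As a rough sanity check, the recall implied by those two figures can be worked out, assuming F1 is the standard harmonic mean of precision and recall and that both figures come from the same evaluation setting (neither is stated in the abstract). The function names below are illustrative and do not come from the paper.

```python
def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, assuming the standard F1 definition."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall: F1 must be below 2 * precision")
    return f1 * precision / denom


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract: 75% precision, 48% F1.
recall = implied_recall(0.75, 0.48)
print(f"implied recall: {recall:.3f}")  # implied recall: 0.353
```

Under these assumptions the reported numbers correspond to a recall of roughly 35%, so the system appears to favor precision over coverage.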