IR AI CL

Oct 28, 2022

Kuaipedia: a Large-scale Multi-modal Short-video Encyclopedia

Haojie Pan, Zepeng Zhai, Yuzhou Zhang, Ruiji Fu, Ming Liu, Yangqiu Song, Zhongyuan Wang, Bing Qin

arXiv:2211.00732v39.711 citations

h-index: 52

Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more dynamic and expressive knowledge representation in encyclopedias, particularly for users seeking practical or visual information, though it is incremental as it builds on existing entity linking and multi-modal analysis techniques.

Traditional online encyclopedias lack expressive content for certain aspects of items. The authors address this by building Kuaipedia, a large-scale multi-modal short-video encyclopedia extracted from billions of videos on Kuaishou. The resource links videos to item-aspect pairs with high accuracy and improves downstream applications such as entity typing and entity linking.

Online encyclopedias, such as Wikipedia, have been well developed and researched over the last two decades. One can find attributes and other information about a wiki item on a wiki page edited by a community of volunteers. However, traditional text, images, and tables can hardly express some aspects of a wiki item. For example, when we talk about "Shiba Inu", one may care more about "How to feed it" or "How to train it not to protect its food". Short-video platforms have become a hallmark of the online world. Whether you're on TikTok, Instagram, Kuaishou, or YouTube Shorts, short-video apps have changed how we consume and create content today. Beyond producing short videos for entertainment, more and more authors share insightful knowledge from all walks of life. These short videos, which we call knowledge videos, can easily express any aspect (e.g. hair or how-to-feed) that consumers want to know about an item (e.g. Shiba Inu), and they can be systematically analyzed and organized like an online encyclopedia. In this paper, we propose Kuaipedia, a large-scale multi-modal encyclopedia consisting of items, aspects, and short videos linked to them, which was extracted from billions of videos on Kuaishou (Kwai), a well-known short-video platform in China. We first collected items from multiple sources and mined user-centered aspects from millions of users' queries to build an item-aspect tree. Then we propose a new task called "multi-modal item-aspect linking", an expansion of "entity linking", to link short videos to item-aspect pairs and build the whole short-video encyclopedia. Intrinsic evaluations show that our encyclopedia is large in scale and highly accurate. We also conduct sufficient extrinsic experiments to show how Kuaipedia can help fundamental applications such as entity typing and entity linking.
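
To make the item-aspect structure described in the abstract concrete, here is a minimal Python sketch of an item-aspect tree with videos attached to item-aspect pairs. It is an illustration only, not the authors' implementation: the names `ItemNode`, `ItemAspectTree`, `add_aspect`, `link_video`, and `videos_for` are hypothetical, and the linking decision (a learned multi-modal task in the paper) is assumed to be made elsewhere and passed in by the caller.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ItemNode:
    """One item (e.g. "Shiba Inu") with its aspects and linked videos."""
    name: str
    # aspect name -> ids of videos linked to the (item, aspect) pair
    aspects: Dict[str, List[str]] = field(default_factory=dict)


class ItemAspectTree:
    """Hypothetical container: items at the top, aspects beneath each item."""

    def __init__(self) -> None:
        self.items: Dict[str, ItemNode] = {}

    def add_aspect(self, item: str, aspect: str) -> None:
        # Create the item on first use, then register the aspect under it.
        node = self.items.setdefault(item, ItemNode(item))
        node.aspects.setdefault(aspect, [])

    def link_video(self, video_id: str, item: str, aspect: str) -> bool:
        # Attach a video to a known (item, aspect) pair.
        # Returns False when the pair is not in the tree.
        node = self.items.get(item)
        if node is None or aspect not in node.aspects:
            return False
        node.aspects[aspect].append(video_id)
        return True

    def videos_for(self, item: str, aspect: str) -> List[str]:
        # Return a copy so callers cannot mutate the tree by accident.
        node = self.items.get(item)
        if node is None:
            return []
        return list(node.aspects.get(aspect, []))


tree = ItemAspectTree()
tree.add_aspect("Shiba Inu", "how-to-feed")
tree.link_video("video-001", "Shiba Inu", "how-to-feed")
```

Keeping videos under the pair rather than under the item alone mirrors the abstract's point that a video answers a specific aspect of an item, not the item in general.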

