Multimodal Entity Linking for Tweets
This work addresses entity linking in social media, with applications such as information extraction. Its contribution is incremental, however, because it adapts existing multimodal approaches to a new Twitter-specific dataset.
The paper tackles multimodal entity linking for tweets. It proposes a method for building an annotated Twitter dataset and a model that jointly learns representations of mentions and entities from text and images, and it shows that using visual information improves performance.
In many information extraction applications, entity linking (EL) has emerged as a crucial task that allows leveraging information about named entities from a knowledge base. In this paper, we address the task of multimodal entity linking (MEL), an emerging research field in which textual and visual information is used to map an ambiguous mention to an entity in a knowledge base (KB). First, we propose a method for building a fully annotated Twitter dataset for MEL, where entities are defined in a Twitter KB. Then, we propose a model for jointly learning a representation of both mentions and entities from their textual and visual contexts. We demonstrate the effectiveness of the proposed model by evaluating it on the proposed dataset and highlight the importance of leveraging visual information when it is available.
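To make the idea of linking from joint textual and visual context concrete, the following is a minimal sketch and not the paper's model. It assumes that text and image feature vectors have already been extracted by some encoder, fuses them by simple concatenation, and ranks candidate KB entities by cosine similarity. The function names (`fuse`, `cosine`, `link`), the toy vectors, and the entity names are all hypothetical; the abstract describes a learned joint representation, which this fixed fusion does not attempt to reproduce.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(text_vec, image_vec=None, image_dim=0):
    """Concatenate normalized text and image features (an assumed fusion).

    When no image is available, the image part is zero-padded to image_dim
    so that every fused vector has the same length.
    """
    text_part = l2_normalize(text_vec)
    if image_vec is None:
        image_part = [0.0] * image_dim
    else:
        image_part = l2_normalize(image_vec)
    return text_part + image_part


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link(mention_vec, candidates):
    """Return the name of the candidate entity most similar to the mention."""
    return max(candidates, key=lambda name: cosine(mention_vec, candidates[name]))


# Toy example: both candidates have identical text features, so the text
# alone is ambiguous; only the image features separate them.
candidates = {
    "entity_a": fuse([1.0, 0.0], [1.0, 0.0]),
    "entity_b": fuse([1.0, 0.0], [0.0, 1.0]),
}
mention = fuse([1.0, 0.0], [0.0, 1.0])
best = link(mention, candidates)
```

In this toy setup the two candidates are indistinguishable from text alone, and the image part of the fused vector is what selects `entity_b`. This mirrors, in a very simplified way, the abstract's point that visual information helps when it is available.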