Multi-Task Triplet Loss for Named Entity Recognition using Supplementary Text
This work addresses named entity recognition in retail data whose text forms differ syntactically, but the contribution is incremental.
The paper tackles the problem of identifying item names in retail text by using a multi-task triplet loss to contrast embeddings of item titles and descriptions, yielding small improvements in precision and recall and a significant gain in exact match accuracy.
Retail item data contains many different forms of text, such as the title of an item, its description, the item name, and reviews. It is of interest to identify the item name in the other forms of text using a named entity tagger. However, the title of an item and its description are syntactically different (but semantically similar): the title is not necessarily a well-formed sentence, while the description is made up of well-formed sentences. In this work, we use a triplet loss to contrast the embeddings of the item title with those of the description, as a proof of concept. We find that using the triplet loss in a multi-task NER algorithm improves both precision and recall by a small percentage. While the improvement is small, we think it is a step in the right direction toward using various forms of text in a multi-task algorithm. In addition to precision and recall, the multi-task triplet loss method is also found to significantly improve the exact match accuracy, i.e., the accuracy of tagging the entire set of tokens in a text with the correct tags.
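To make the two central quantities concrete, the following is a minimal sketch of a margin-based triplet loss, its combination with an NER loss, and the exact match accuracy. The abstract does not state the distance function, the margin, the task weighting, or how triplets are formed, so the Euclidean distance, `margin`, `weight`, and the choice of title embedding as anchor, its own description as positive, and another item's description as negative are all assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss.

    Assumed setup (not confirmed by the abstract): anchor is a title
    embedding, positive is the embedding of the same item's description,
    negative is the embedding of a different item's description.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def multi_task_loss(ner_loss, trip_loss, weight=0.5):
    """Weighted sum of the tagging loss and the triplet loss.

    The weighting scheme is hypothetical; the abstract does not specify it.
    """
    return ner_loss + weight * trip_loss


def exact_match_accuracy(gold, pred):
    """Fraction of texts whose full tag sequence is predicted correctly."""
    matches = sum(1 for g, p in zip(gold, pred) if g == p)
    return matches / len(gold)
```

A text counts toward exact match accuracy only if every one of its tokens is tagged correctly, so a single wrong tag makes the whole text a miss. This is why the metric can move by a larger amount than token-level precision and recall.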