IDS at SemEval-2020 Task 10: Does Pre-trained Language Model Know What to Emphasize?
This work addresses emphasis selection for text in visual media. Its contribution is incremental: it reuses the self-attention of existing pre-trained language models (PLMs) instead of introducing a new model.
The authors identify which words to emphasize in visual media text from the self-attention distributions of pre-trained language models. Applied zero-shot, the approach outperforms a TF-IDF baseline.
We propose a novel method that enables us to determine words that deserve to be emphasized from written text in visual media, relying only on the information from the self-attention distributions of pre-trained language models (PLMs). With extensive experiments and analyses, we show that 1) our zero-shot approach is superior to a reasonable baseline that adopts TF-IDF and that 2) there exist several attention heads in PLMs specialized for emphasis selection, confirming that PLMs are capable of recognizing important words in sentences.
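To make the idea of attention-based emphasis selection concrete, the following is a minimal sketch, not the method as implemented in the paper. It assumes one plausible scoring rule, namely that a token's emphasis score is the average attention it receives from all tokens in a single head; the abstract does not state the exact rule. The function names `emphasis_scores` and `rank_tokens` are hypothetical, and the attention matrix holds hand-written toy values, not the output of any real PLM.

```python
from typing import List, Sequence, Tuple


def emphasis_scores(attention: Sequence[Sequence[float]]) -> List[float]:
    """Score each token by the mean attention it receives.

    attention[i][j] is the weight token i places on token j in one head.
    Averaging column j over all rows i is an assumed scoring rule,
    chosen for illustration only.
    """
    n = len(attention)
    if n == 0:
        return []
    if any(len(row) != n for row in attention):
        raise ValueError("attention must be a square matrix")
    return [sum(attention[i][j] for i in range(n)) / n for j in range(n)]


def rank_tokens(
    tokens: Sequence[str], attention: Sequence[Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (token, score) pairs sorted from most to least emphasized."""
    scores = emphasis_scores(attention)
    if len(tokens) != len(scores):
        raise ValueError("need exactly one token per attention row")
    return sorted(zip(tokens, scores), key=lambda pair: pair[1], reverse=True)


# Toy example: each row sums to 1, as a softmax attention row would.
# These numbers are invented for illustration.
TOKENS = ["never", "give", "up"]
ATTENTION = [
    [0.6, 0.1, 0.3],
    [0.5, 0.2, 0.3],
    [0.4, 0.1, 0.5],
]

if __name__ == "__main__":
    for token, score in rank_tokens(TOKENS, ATTENTION):
        print(f"{token}\t{score:.3f}")
```

In a zero-shot setting of this kind, the only choices to make are which layer and head to read and how to turn the attention matrix into per-token scores. Searching over heads with a scoring function like the one above is one way to look for heads that behave as if specialized for emphasis selection.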