Dual-View Distilled BERT for Sentence Embedding
This work addresses a specific bottleneck in sentence embedding methods, the loss of cross-sentence attention in siamese BERT networks, and offers an incremental improvement over existing approaches.
The paper tackles the performance drop that BERT suffers when used in siamese networks for sentence matching. It proposes Dual-view distilled BERT (DvBERT), which combines a Siamese View and an Interaction View to strengthen sentence embeddings, and it reports significantly better results than state-of-the-art sentence embedding methods on six STS tasks.
Recently, BERT has achieved significant progress on sentence matching via word-level cross-sentence attention. However, performance drops significantly when siamese BERT networks are used to derive two sentence embeddings, which fall short of capturing global semantics because the word-level attention between the two sentences is absent. In this paper, we propose Dual-view distilled BERT (DvBERT) for sentence matching with sentence embeddings. Our method treats a sentence pair from two distinct views: the Siamese View and the Interaction View. The Siamese View is the backbone from which we generate sentence embeddings. The Interaction View integrates cross-sentence interaction as multiple teachers to boost the representation ability of the sentence embeddings. Experiments on six STS tasks show that our method significantly outperforms state-of-the-art sentence embedding methods.
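
To make the two views concrete, the following is a minimal sketch of how a siamese student could be trained against several interaction-view teachers. It is an illustration under stated assumptions, not the paper's actual objective: the abstract does not give the loss, so the cosine-similarity student score, the mean-squared-error terms, the averaging over teachers, and the weight `alpha` are all hypothetical choices, as are the function names. The embeddings and teacher scores are passed in as plain numbers; in practice they would come from BERT encoders.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Siamese View score: cosine similarity of two independently encoded sentences."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dual_view_loss(
    emb_a: Sequence[float],
    emb_b: Sequence[float],
    gold_score: float,
    teacher_scores: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Hypothetical combined loss for one sentence pair.

    emb_a, emb_b    -- sentence embeddings from the siamese student
    gold_score      -- human similarity label, assumed rescaled to [-1, 1]
    teacher_scores  -- scores from interaction-view (cross-attention) teachers
    alpha           -- assumed weight between the label term and the teacher term
    """
    student = cosine_similarity(emb_a, emb_b)
    # Hard term: match the gold label.
    hard = (student - gold_score) ** 2
    # Soft term: match each teacher's prediction, averaged over teachers.
    soft = sum((student - t) ** 2 for t in teacher_scores) / len(teacher_scores)
    return alpha * hard + (1 - alpha) * soft


if __name__ == "__main__":
    loss = dual_view_loss([1.0, 0.0], [0.0, 1.0], 1.0, [0.5, 0.5])
    print(f"loss = {loss:.3f}")
```

The point of the sketch is the division of labor: only the student's embeddings are needed at inference time, while the teachers, which see both sentences jointly, contribute only a training signal.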