Vec2Sent: Probing Sentence Embeddings with Natural Language Generation
This work provides a new unsupervised probing method for evaluating sentence embeddings; the contribution is incremental but useful for NLP researchers.
The paper tackles the problem of introspecting black-box sentence embeddings: it generates natural language conditioned on an embedding in order to retrieve the original sentence. It shows that performance on this probing task correlates with downstream task performance, and it applies the approach to generating sentence analogies.
We introspect black-box sentence embeddings by conditionally generating from them with the objective of retrieving the underlying discrete sentence. We perceive this as a new unsupervised probing task and show that it correlates well with downstream task performance. We also illustrate how the language generated from different encoders differs. We apply our approach to generate sentence analogies from sentence embeddings.
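
To make the probing idea concrete, the following is a minimal sketch of how one might score sentence retrieval and form an analogy query vector. It is illustrative only: the token-recall metric, the arithmetic `b - a + c`, and the names `token_recall`, `probe_score`, and `analogy_vector` are assumptions made for this example and are not taken from the paper. The encoder and the conditional generator are left out entirely; the sketch only assumes that some generator has already produced a sentence from each embedding.

```python
from collections import Counter


def token_recall(original: str, generated: str) -> float:
    """Fraction of tokens in `original` recovered in `generated`.

    Assumed metric for illustration: lowercased whitespace tokens,
    compared as multisets so repeated words must be repeated.
    """
    orig = Counter(original.lower().split())
    gen = Counter(generated.lower().split())
    total = sum(orig.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, gen[tok]) for tok, count in orig.items())
    return overlap / total


def probe_score(pairs) -> float:
    """Mean token recall over (original, generated) sentence pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(token_recall(o, g) for o, g in pairs) / len(pairs)


def analogy_vector(a, b, c):
    """Hypothetical analogy query: b - a + c, element-wise.

    A generator conditioned on this vector would then be asked to
    produce the sentence that completes "a is to b as c is to ?".
    """
    if not (len(a) == len(b) == len(c)):
        raise ValueError("embeddings must have the same dimension")
    return [bi - ai + ci for ai, bi, ci in zip(a, b, c)]
```

Under these assumptions, an encoder whose embeddings let the generator recover more of the original tokens receives a higher `probe_score`, which is the kind of quantity one would then compare against downstream task performance.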