CL · Sep 10, 2021

How Does Fine-tuning Affect the Geometry of Embedding Space: A Case Study on Isotropy

arXiv:2109.04740v1668 citations
AI Analysis

This work addresses the limited understanding of why fine-tuning improves performance. It focuses on structural changes in the embedding space and is aimed at researchers in NLP and machine learning.

The paper investigates how fine-tuning pre-trained language models alters the geometry of the embedding space, specifically its isotropy. It finds that fine-tuning does not necessarily improve isotropy and that it sharply increases the number of elongated directions. In the fine-tuned space these directions carry essential linguistic knowledge, which makes existing isotropy enhancement methods ineffective.

It is widely accepted that fine-tuning pre-trained language models usually brings about performance improvements in downstream tasks. However, there are limited studies on the reasons behind this effectiveness, particularly from the viewpoint of structural changes in the embedding space. Trying to fill this gap, in this paper, we analyze the extent to which the isotropy of the embedding space changes after fine-tuning. We demonstrate that, even though isotropy is a desirable geometrical property, fine-tuning does not necessarily result in isotropy enhancements. Moreover, local structures in pre-trained contextual word representations (CWRs), such as those encoding token types or frequency, undergo a massive change during fine-tuning. Our experiments show dramatic growth in the number of elongated directions in the embedding space, which, in contrast to pre-trained CWRs, carry the essential linguistic knowledge in the fine-tuned embedding space, making existing isotropy enhancement methods ineffective.
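To make the notion of isotropy concrete, here is a minimal sketch of one way to score it. The abstract does not say which measure the authors use, so this assumes the partition-function approximation that is common in the isotropy literature: the ratio of the smallest to the largest value of Z(c) = Σ_i exp(c · w_i), with c ranging over the eigenvectors of WᵀW. The function name `isotropy_score` and its argument are illustrative, not taken from the paper.

```python
import numpy as np


def isotropy_score(embeddings: np.ndarray) -> float:
    """Approximate isotropy of an (n_vectors, dim) embedding matrix.

    Assumed measure (not confirmed by the abstract): min_c Z(c) / max_c Z(c),
    where Z(c) = sum_i exp(c . w_i) and c ranges over the eigenvectors of
    W^T W. Values near 1 suggest an isotropic space; values near 0 suggest
    that a few elongated directions dominate.
    """
    w = np.asarray(embeddings, dtype=np.float64)

    # Columns of eigvecs are the unit-norm eigenvectors of the symmetric
    # matrix W^T W.
    _, eigvecs = np.linalg.eigh(w.T @ w)

    # logits[i, j] = w_i . c_j
    logits = w @ eigvecs

    # log Z(c_j), computed with a numerically stable log-sum-exp.
    col_max = logits.max(axis=0)
    log_z = col_max + np.log(np.exp(logits - col_max).sum(axis=0))

    # Ratio of smallest to largest partition-function value.
    return float(np.exp(log_z.min() - log_z.max()))
```

Applied to the contextual word representations of a pre-trained model and of its fine-tuned counterpart, a score like this would allow the kind of before-and-after comparison the abstract describes. Stretching a single coordinate of an otherwise uniform point cloud should push the score toward 0.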
