Retrieval-Augmented Classification with Decoupled Representation
This work addresses a specific bottleneck in retrieval-augmented classification for machine learning practitioners, offering an incremental improvement over existing methods.
The paper tackles noise sensitivity and limited expandability in retrieval-augmented classification by proposing a KNN-based method that interpolates the predicted label distribution with the label distributions of retrieved instances. It decouples the representations used for classification and retrieval, since sharing a single representation hurts performance and destabilizes training. Experimental results show the method is effective and robust across a range of classification datasets.
Retrieval-augmented methods have shown promising results in various classification tasks. However, existing methods focus on retrieving extra context to enrich the input, which is noise-sensitive and non-expandable. In this paper, we follow this line of work and propose a $k$-nearest-neighbor (KNN)-based method for retrieval-augmented classification, which interpolates the predicted label distribution with the label distributions of retrieved instances. Unlike the standard KNN process, our method uses a decoupling mechanism, as we find that sharing one representation between classification and retrieval hurts performance and leads to training instability. We evaluate our method on a wide range of classification datasets. Experimental results demonstrate the effectiveness and robustness of the proposed method. We also conduct additional experiments to analyze the contributions of the different components of our model.\footnote{\url{https://github.com/xnliang98/knn-cls-w-decoupling}}
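The interpolation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the distance-weighted softmax over neighbors, the one-hot neighbor labels, and the parameters `k`, `temperature`, and `lam` are all assumptions made for illustration. Decoupling is represented only by passing the retrieval vector (`query_key`) separately from the classifier output (`p_model`), on the assumption that the two come from different representations.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def knn_label_distribution(query_key, store_keys, store_labels,
                           num_classes, k=2, temperature=1.0):
    """Build a label distribution from the k nearest stored instances.

    Assumption: neighbors are weighted by a softmax over negative
    squared L2 distances, and each neighbor contributes a one-hot label.
    """
    dists = np.sum((store_keys - query_key) ** 2, axis=1)
    nn_idx = np.argsort(dists)[:k]
    weights = softmax(-dists[nn_idx] / temperature)
    p_knn = np.zeros(num_classes)
    # Accumulate neighbor weights per class (handles repeated labels).
    np.add.at(p_knn, store_labels[nn_idx], weights)
    return p_knn


def interpolate(p_model, p_knn, lam=0.5):
    """Mix the classifier's distribution with the retrieved one.

    lam is a hypothetical interpolation weight in [0, 1].
    """
    return lam * p_knn + (1.0 - lam) * p_model


# Toy datastore: retrieval-side vectors and their gold labels.
store_keys = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]])
store_labels = np.array([0, 0, 1, 1])

# Decoupled inputs (assumed): the retrieval vector and the classifier's
# predicted distribution are produced separately.
query_key = np.array([0.05, 0.0])
p_model = np.array([0.4, 0.6])

p_knn = knn_label_distribution(query_key, store_keys, store_labels,
                               num_classes=2, k=2)
p_final = interpolate(p_model, p_knn, lam=0.5)
```

In this toy setting the classifier alone prefers class 1, while both nearest neighbors carry label 0, so the interpolated distribution prefers class 0. How much the retrieved labels should count (`lam`) is a tunable choice in this sketch, not a value taken from the paper.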