CorrEmbed: Evaluating Pre-trained Model Image Similarity Efficacy with a Novel Metric
This work addresses a gap in evaluating pre-trained models for image similarity, offering a tool for researchers and practitioners in domains such as fashion retail. The contribution is incremental, since it builds on existing embedding methods.
The paper tackles the problem of evaluating pre-trained computer vision models for image similarity tasks. It introduces CorrEmbed, a metric that correlates distances between image embeddings with distances between human-generated tag vectors. The metric reveals that tag-correlation scores scale linearly with ImageNet1k accuracy, and it identifies models that deviate from this trend, providing insight into how different models capture high-level image features.
Visual similarity between images is a particularly useful signal when computing product recommendations. Embedding similarity, which uses pre-trained computer vision models to extract high-level image features, has demonstrated remarkable efficacy in identifying images with similar compositions. However, there is a lack of methods for evaluating the embeddings generated by these models, as conventional loss and performance metrics do not adequately capture their performance in image similarity search tasks. In this paper, we evaluate the viability of the image embeddings from numerous pre-trained computer vision models using a novel approach named CorrEmbed. Our approach computes the correlation between distances in image embeddings and distances in human-generated tag vectors. We extensively evaluate numerous pre-trained Torchvision models using this metric, revealing an intuitive linear scaling relationship between ImageNet1k accuracy scores and tag-correlation scores. Importantly, our method also identifies deviations from this pattern, providing insights into how different models capture high-level image features. By offering a robust performance evaluation of these pre-trained models, CorrEmbed serves as a valuable tool for researchers and practitioners seeking to develop effective, data-driven approaches to similar item recommendations in fashion retail.
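The core idea of correlating embedding distances with tag-vector distances can be illustrated with a minimal sketch. The abstract does not specify the distance function or the correlation coefficient, so this example assumes cosine distance and Pearson correlation; the function names `pairwise_cosine_distances` and `tag_correlation_score` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def pairwise_cosine_distances(x: np.ndarray) -> np.ndarray:
    """Return cosine distances for every unordered pair of rows in x."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.clip(norms, 1e-12, None)  # avoid division by zero
    dist = 1.0 - unit @ unit.T
    upper = np.triu_indices(len(x), k=1)  # each pair once, no diagonal
    return dist[upper]


def tag_correlation_score(embeddings, tag_vectors) -> float:
    """Pearson correlation between embedding distances and tag distances.

    Assumption: cosine distance and Pearson correlation are stand-ins;
    the paper's actual choices may differ.
    """
    d_emb = pairwise_cosine_distances(np.asarray(embeddings, dtype=float))
    d_tag = pairwise_cosine_distances(np.asarray(tag_vectors, dtype=float))
    return float(np.corrcoef(d_emb, d_tag)[0, 1])


# Toy data (invented for illustration): four items with binary tag vectors.
# Items 0 and 1 share a tag, as do items 2 and 3.
tags = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
])

# Embeddings that place similarly tagged items close together.
aligned = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]])
# The same vectors assigned to the wrong items.
shuffled = np.array([[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9]])

aligned_score = tag_correlation_score(aligned, tags)
shuffled_score = tag_correlation_score(shuffled, tags)
print(f"aligned: {aligned_score:.3f}, shuffled: {shuffled_score:.3f}")
```

Under these assumptions, a model whose embeddings group similarly tagged items together receives a higher score than one whose embeddings do not. In practice the `embeddings` array would hold features extracted from a pre-trained model, one row per image.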