CV | Aug 17, 2023

FashionLOGO: Prompting Multimodal Large Language Models for Fashion Logo Embeddings

arXiv:2308.09012v2 | 3 citations | h-index: 35
AI Analysis

This paper addresses logo recognition and detection for e-commerce platforms, which supports intellectual property enforcement and product search. The contribution is incremental because it builds on existing multimodal approaches.

The paper tackled the problem of logo embedding for e-commerce by proposing FashionLOGO, a method that uses multimodal large language models to generate text prompts from product images, which improved visual models for logo embeddings and achieved state-of-the-art performance on benchmarks.

Logo embedding models convert the product logos in images into vectors, enabling logo recognition and detection on e-commerce platforms. This facilitates the enforcement of intellectual property rights and improves product search. However, current methods treat logo embedding as a purely visual problem. A noteworthy issue is that visual models capture features beyond the logo itself. Instead, we view this as a multimodal task, using text as auxiliary information to help the visual model understand the logo. The emerging Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in both visual and textual understanding. Inspired by this, we propose an approach, FashionLOGO, that explores how to prompt MLLMs to generate appropriate text for product images, which can help visual models achieve better logo embeddings. We adopt a cross-attention transformer block that enables the visual embedding to automatically learn supplementary knowledge from the textual embedding. Our extensive experiments on real-world datasets show that FashionLOGO generates generic and robust logo embeddings, achieving state-of-the-art performance on all benchmarks.
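
The cross-attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head, omits the learned query/key/value projections (treating them as identity), and uses toy two-dimensional embeddings. The names `cross_attention`, `visual_tokens`, and `text_tokens` are hypothetical. It only shows the general mechanism, in which visual tokens act as queries and text tokens act as keys and values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(visual_tokens, text_tokens):
    """Single-head cross-attention with a residual connection.

    Queries come from the visual tokens; keys and values come from the
    text tokens. Learned projections are omitted (identity), which is a
    simplifying assumption and not the paper's design.
    """
    dim = len(visual_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in visual_tokens:
        # Scaled dot-product score of this visual token against each text token.
        scores = [
            scale * sum(q * k for q, k in zip(query, key))
            for key in text_tokens
        ]
        weights = softmax(scores)
        # Attention-weighted sum of the text tokens (used here as values).
        context = [
            sum(w * value[i] for w, value in zip(weights, text_tokens))
            for i in range(dim)
        ]
        # Residual: keep the visual content and add the text-derived context.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


# Toy inputs: two visual patch embeddings and three text token embeddings.
visual = [[1.0, 0.0], [0.0, 1.0]]
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(visual, text)
```

The output has one fused vector per visual token, so the text side can have any number of tokens without changing the shape of the visual embedding. A practical version would more likely use a library attention layer with learned projections and multiple heads.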

Code Implementations: 1 repo
