LG | AI | Mar 20, 2025

Leveraging OpenFlamingo for Multimodal Embedding Analysis of C2C Car Parts Data

arXiv:2503.17408v1 | h-index: 2
Originality: Synthesis-oriented
AI Analysis

This work addresses pattern discovery in multimodal online marketplaces for car parts, but it is incremental as it applies an existing method to new data without major innovations.

The paper analyzes large-scale multimodal data from consumer-to-consumer car parts posts. It applies the OpenFlamingo model to extract embeddings and clusters them with k-means. Most clusters showed a pattern but some did not, which suggests the model is useful here but needs dataset-specific modifications.

In this paper, we investigate the capabilities of multimodal machine learning models, particularly the OpenFlamingo model, in processing a large-scale dataset of consumer-to-consumer (C2C) online posts related to car parts. We collected data from two platforms, OfferUp and Craigslist, resulting in a dataset of over 1.2 million posts with their corresponding images. The OpenFlamingo model was used to extract embeddings for the text and image of each post. We applied $k$-means clustering to the joint embeddings to identify underlying patterns and commonalities among the posts. We found that most clusters contain a pattern, but some clusters show no internal pattern. The results suggest that OpenFlamingo can be used to find patterns in large datasets, but that its architecture needs some modification to suit the dataset.
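The clustering step described in the abstract can be illustrated with a minimal sketch. It does not call OpenFlamingo: the text and image embeddings are small synthetic stand-ins, and the fusion rule (L2-normalize each modality, then concatenate) is an assumption, since the abstract does not say how the joint embeddings were formed. The helper names `l2_normalize`, `joint_embedding`, and `kmeans` are hypothetical, and the plain Lloyd's-algorithm loop is only meant to show the technique, not the authors' implementation.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def joint_embedding(text_vec, image_vec):
    """Assumed fusion rule: concatenate the normalized text and image vectors."""
    return l2_normalize(text_vec) + l2_normalize(image_vec)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic stand-ins for per-post embeddings (NOT real OpenFlamingo output).
rng = random.Random(42)


def noisy(base):
    return [b + rng.gauss(0, 0.05) for b in base]


# Two artificial groups of 20 posts with different text/image directions.
posts = [joint_embedding(noisy([1, 0, 0]), noisy([0, 1, 0])) for _ in range(20)]
posts += [joint_embedding(noisy([0, 0, 1]), noisy([1, 0, 0])) for _ in range(20)]

labels, centroids = kmeans(posts, k=2)
sizes = [labels.count(c) for c in range(2)]
print("cluster sizes:", sizes)
```

At the scale reported in the abstract (over 1.2 million posts), a vectorized or mini-batch k-means from an established library would be the practical choice; the pure-Python loop above is kept only for readability.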

