AI, CL, IR, LG, MM | Sep 2, 2023

Zero-Shot Recommendations with Pre-Trained Large Language Models for Multimodal Nudging

arXiv:2309.01026v2 | 14 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses multimodal content recommendation in zero-shot settings, but the contribution appears incremental because it adapts existing LLM techniques to a specific domain.

The paper tackles zero-shot recommendation of multimodal non-stationary content by rendering inputs as textual descriptions, using pre-trained LLMs to compute semantic embeddings of those descriptions, and then ranking items by a similarity metric without additional learning. The approach is demonstrated on a synthetic multimodal nudging environment with tabular, textual, and visual data, but no concrete numbers are provided.

We present a method for zero-shot recommendation of multimodal non-stationary content that leverages recent advancements in the field of generative AI. We propose rendering inputs of different modalities as textual descriptions and utilizing pre-trained LLMs to obtain their numerical representations by computing semantic embeddings. Once unified representations of all content items are obtained, the recommendation can be performed by computing an appropriate similarity metric between them without any additional learning. We demonstrate our approach on a synthetic multimodal nudging environment, where the inputs consist of tabular, textual, and visual data.
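
The pipeline described in the abstract (render each input as text, embed it, rank by similarity) can be illustrated with a minimal sketch. This is not the authors' implementation. The `embed` function below is a hashed bag-of-words stand-in used only so the example runs without a model download; in the paper's setting it would be replaced by a pre-trained LLM embedding. All names (`render_row`, `embed`, `recommend`, the sample user and items) are hypothetical, and cosine similarity is assumed as the "appropriate similarity metric".

```python
import hashlib
import math

DIM = 4096  # size of the toy embedding space (arbitrary choice for this sketch)


def render_row(row: dict) -> str:
    """Render a tabular record as a plain-text description."""
    return ", ".join(f"{key}: {value}" for key, value in row.items())


def embed(text: str) -> list:
    """Stand-in for a pre-trained LLM embedding (assumption):
    a hashed bag-of-words vector, L2-normalised."""
    vec = [0.0] * DIM
    for raw in text.lower().split():
        token = raw.strip(".,:;!?")
        if not token:
            continue
        index = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[index] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine(a: list, b: list) -> float:
    """Cosine similarity; inputs are already unit-length, so this is a dot product."""
    return sum(x * y for x, y in zip(a, b))


def recommend(query_text: str, items: dict, top_k: int = 1) -> list:
    """Rank item descriptions by similarity to the query; no training step."""
    query_vec = embed(query_text)
    scored = sorted(
        ((cosine(query_vec, embed(desc)), name) for name, desc in items.items()),
        reverse=True,
    )
    return [name for _, name in scored[:top_k]]


# Hypothetical data: a tabular user record and two content items whose
# textual or visual content has already been rendered as text.
user = {"goal": "sleep", "preference": "evening routine"}
items = {
    "sleep_tip": "A short message about a calm evening routine for better sleep",
    "run_photo": "An image caption showing a runner on a sunny morning trail",
}
top = recommend(render_row(user), items)
print(top)
```

Because every modality is reduced to text before embedding, adding a new content item only requires rendering and embedding it, which is what makes the approach usable when the content pool changes over time.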

Code Implementations: 2 repos
