Mahir Yavuz

LG Feb 2, 2023

adSformers: Personalization from Short-Term Sequences and Diversity of Representations in Etsy Ads

Alaa Awad, Denisa Roberts, Eden Dolev et al.

In this article, we present a general approach to personalizing ads through encoding and learning from variable-length sequences of recent user actions and diverse representations. To this end, we introduce a three-component module called the adSformer diversifiable personalization module (ADPM) that learns a dynamic user representation. We illustrate the module's effectiveness and flexibility by personalizing the Click-Through Rate (CTR) and Post-Click Conversion Rate (PCCVR) models used in sponsored search. The first component of the ADPM, the adSformer encoder, includes a novel adSformer block which learns the most salient sequence signals. ADPM's second component enriches the learned signal through visual, multimodal, and other pretrained representations. Lastly, the third ADPM component, "learned on the fly", further diversifies the signal encoded in the dynamic user representation. The ADPM-personalized CTR and PCCVR models, henceforth referred to as adSformer CTR and adSformer PCCVR, outperform the CTR and PCCVR production baselines by $+2.66\%$ and $+2.42\%$, respectively, in offline Area Under the Receiver Operating Characteristic Curve (ROC-AUC). Following robust online gains in A/B tests, Etsy Ads deployed the ADPM-personalized sponsored search system to $100\%$ of traffic as of February 2023.

CV May 22, 2023

Efficient Large-Scale Visual Representation Learning And Evaluation

Eden Dolev, Alaa Awad, Denisa Roberts et al.

Efficiently learning visual representations of items is vital for large-scale recommendations. In this article, we compare several pretrained efficient backbone architectures from both the convolutional neural network (CNN) and vision transformer (ViT) families. We describe challenges in e-commerce vision applications at scale and highlight methods to efficiently train, evaluate, and serve visual representations. We present ablation studies evaluating visual representations in several downstream tasks. To this end, we present a novel multilingual text-to-image generative offline evaluation method for visually similar recommendation systems. Finally, we include online results from deployed machine learning systems in production on a large-scale e-commerce platform.

2 Papers