CL · Jan 31, 2023

ZhichunRoad at Amazon KDD Cup 2022: MultiTask Pre-Training for E-Commerce Product Search

arXiv:2301.13455v1 · 2 citations · h-index: 4
Originality: Synthesis-oriented
AI Analysis

This work addresses search quality for e-commerce platforms, but it is incremental as it builds on existing pre-training and fine-tuning methods.

The authors aim to improve multilingual e-commerce product search with a robust model that combines multitask pre-training with several fine-tuning techniques. The approach achieved competitive results and ranked top-8 in three tasks.

In this paper, we propose a robust multilingual model to improve the quality of search results. Our model not only leverages the processed class-balanced dataset, but also benefits from multitask pre-training that leads to more general representations. In the pre-training stage, we adopt an MLM task, a classification task and a contrastive learning task to achieve considerable performance. In the fine-tuning stage, we use confident learning, the exponential moving average method (EMA), adversarial training (FGM) and the regularized dropout strategy (R-Drop) to improve the model's generalization and robustness. Moreover, we use a multi-granular semantic unit to discover the textual metadata of queries and products, enhancing the representation of the model. Our approach obtained competitive results and ranked top-8 in three tasks. We release the source code and pre-trained models associated with this work.
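
Two of the fine-tuning techniques named in the abstract, EMA and R-Drop, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the `decay` and `alpha` defaults, and the choice to sum the two cross-entropy terms (one common formulation of R-Drop) are all assumptions made for illustration.

```python
import math

def ema_update(shadow, params, decay=0.999):
    """Blend current weights into the EMA ("shadow") weights.

    decay=0.999 is an illustrative default, not a value from the paper.
    """
    return [decay * s + (1.0 - decay) * p for s, p in zip(shadow, params)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def r_drop_loss(logits_a, logits_b, label, alpha=1.0):
    """R-Drop style loss for one example.

    logits_a and logits_b stand for two forward passes of the same input
    under different dropout masks. The loss is the sum of both
    cross-entropy terms plus alpha times the symmetric KL divergence
    between the two predicted distributions. alpha=1.0 is an assumption.
    """
    p, q = softmax(logits_a), softmax(logits_b)
    ce = -math.log(p[label]) - math.log(q[label])
    sym_kl = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return ce + alpha * sym_kl

# Usage: one EMA step on toy weights and one R-Drop loss on toy logits.
shadow = ema_update([1.0, 2.0], [0.0, 4.0], decay=0.9)
loss = r_drop_loss([2.0, 0.5], [1.5, 0.7], label=0)
```

At evaluation time, the EMA weights would typically be used in place of the raw weights. The KL term penalizes disagreement between the two dropout passes, which is the regularization effect R-Drop relies on.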

Code Implementations: 1 repo