CL AI LG | Aug 5, 2024

Winning Amazon KDD Cup'24

Chris Deotte, Ivan Sorokin, Ahmet Erdem, Benedikt Schifferer, Gilberto Titericz, Simon Jegou

arXiv:2408.04658v1 | 1.92 citations | h-index: 14

Originality: Synthesis-oriented

AI Analysis

This is an incremental, competition-specific solution to the problem posed by the KDD Cup: building a useful assistant that answers questions about online shopping.

The paper tackles the Amazon KDD Cup 2024 Multi Task Online Shopping Challenge with a single model per track: Qwen2-72B-Instruct fine-tuned on augmented and synthetic data, combined with techniques such as wise-ft and LoRA ensembling. The solution placed first in each individual track and first overall.
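As a rough illustration of the wise-ft idea named above, the sketch below linearly interpolates between a base model's weights and a fine-tuned model's weights. The function name `wise_ft`, the toy weight dictionaries, and the mixing coefficient `alpha` are assumptions made for this example; the summary does not state how the authors implemented the technique or which coefficient they used.

```python
def wise_ft(base, finetuned, alpha):
    """Return (1 - alpha) * base + alpha * finetuned for every parameter.

    `base` and `finetuned` map parameter names to flat lists of floats.
    alpha = 0 keeps the base model, alpha = 1 keeps the fine-tuned model.
    This is a toy stand-in for real tensors, not the authors' code.
    """
    if base.keys() != finetuned.keys():
        raise ValueError("both models must have the same parameter names")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return {
        name: [(1.0 - alpha) * b + alpha * f
               for b, f in zip(base[name], finetuned[name])]
        for name in base
    }


# Hypothetical two-parameter "models" used only to show the arithmetic.
base_weights = {"layer.weight": [0.0, 2.0]}
finetuned_weights = {"layer.weight": [1.0, 4.0]}

# Halfway between the two sets of weights.
mixed = wise_ft(base_weights, finetuned_weights, alpha=0.5)
```

Values of `alpha` below 1 pull the fine-tuned weights back toward the base model, which is the usual motivation for using this kind of interpolation when the test distribution may differ from the training data.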

This paper describes the winning solution of all 5 tasks for the Amazon KDD Cup 2024 Multi Task Online Shopping Challenge for LLMs. The challenge was to build a useful assistant that answers questions in the domain of online shopping. The competition contained 57 diverse tasks, covering 5 different task types (e.g. multiple choice) across 4 different tracks (e.g. multi-lingual). Our solution is a single model per track. We fine-tune Qwen2-72B-Instruct on our own training dataset. As the competition released only 96 example questions, we developed our own training dataset by processing multiple public datasets and by using Large Language Models for data augmentation and synthetic data generation. We apply wise-ft to account for distribution shifts and ensemble multiple LoRA adapters in one model. We employ Logits Processors to constrain the model output to the tokens relevant for each task. AWQ 4-bit quantization and vLLM are used during inference to predict the test dataset within the time constraints of 20 to 140 minutes, depending on the track. Our solution achieved first place in each individual track and first place overall in Amazon's KDD Cup 2024.
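The logits-processing step mentioned in the abstract can be pictured with a minimal, framework-independent sketch: before a token is picked, every logit outside an allowed set is masked to negative infinity, so only task-relevant tokens (for example, the option labels of a multiple-choice question) can be selected. The helper names, the toy vocabulary, and the logit values below are all hypothetical; this is not the authors' implementation and does not use the vLLM API.

```python
import math


def constrain_logits(logits, allowed_token_ids):
    """Mask every logit whose token id is not in the allowed set."""
    allowed = set(allowed_token_ids)
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def greedy_pick(logits):
    """Return the index of the largest logit (greedy decoding)."""
    return max(range(len(logits)), key=logits.__getitem__)


# Hypothetical six-token vocabulary and raw next-token logits.
vocab = ["the", "0", "1", "2", "3", "answer"]
raw_logits = [5.0, 0.1, 2.0, 0.3, 1.5, 4.0]

# For a multiple-choice question only the option labels are valid outputs.
option_ids = [vocab.index(label) for label in ("0", "1", "2", "3")]

unconstrained = vocab[greedy_pick(raw_logits)]
constrained = vocab[greedy_pick(constrain_logits(raw_logits, option_ids))]
```

In this toy case the unconstrained pick is the filler word "the", while the constrained pick is the highest-scoring option label, which shows why masking helps when the expected answer format is fixed.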

