AI · Sep 28, 2025

Mix-Ecom: Towards Mixed-Type E-Commerce Dialogues with Complex Domain Rules

arXiv:2509.23836v1 · 4 citations · h-index: 24
Originality: Synthesis-oriented
AI Analysis

This paper addresses a gap in benchmarking for e-commerce agents, though the contribution is incremental: it extends existing benchmarking frameworks with new data and evaluation dimensions.

The authors address the lack of evaluation for mixed-type dialogues and complex domain rules in current e-commerce agent benchmarks by introducing Mix-ECom, a corpus of 4,799 samples covering multiple dialogue types, task types, and 82 e-commerce rules. They show that current agents struggle with hallucination caused by these complexities.

E-commerce agents contribute greatly to helping users complete their e-commerce needs. To promote further research and application of e-commerce agents, benchmarking frameworks have been introduced for evaluating LLM agents in the e-commerce domain. Despite this progress, current benchmarks do not evaluate agents' ability to handle mixed-type e-commerce dialogues and complex domain rules. To address this issue, this work first introduces a novel corpus, termed Mix-ECom, constructed from real-world customer-service dialogues, with post-processing to remove private user information and add chain-of-thought (CoT) reasoning. Specifically, Mix-ECom contains 4,799 samples with multiple dialogue types in each e-commerce dialogue, covering four dialogue types (QA, recommendation, task-oriented dialogue, and chit-chat), three e-commerce task types (pre-sales, logistics, and after-sales), and 82 e-commerce rules. Furthermore, this work builds baselines on Mix-ECom and proposes a dynamic framework to further improve performance. Results show that current e-commerce agents lack sufficient capabilities to handle e-commerce dialogues, due to hallucination caused by complex domain rules. The dataset will be publicly available.

