Congyi Zhou

h-index3
2papers

2 Papers

AINov 3, 2025
Unbiased Platform-Level Causal Estimation for Search Systems: A Competitive Isolation PSM-DID Framework

Ying Song, Yijing Wang, Hui Yang et al.

Evaluating platform-level interventions in search-based two-sided marketplaces is fundamentally challenged by systemic effects such as spillovers and network interference. While widely used for causal inference, the PSM (Propensity Score Matching) - DID (Difference-in-Differences) framework remains susceptible to selection bias and cross-unit interference from unaccounted spillovers. In this paper, we introduced Competitive Isolation PSM-DID, a novel causal framework that integrates propensity score matching with competitive isolation to enable platform-level effect measurement (e.g., order volume, GMV) instead of item-level metrics in search systems. Our approach provides theoretically guaranteed unbiased estimation under mutual exclusion conditions, with an open dataset released to support reproducible research on marketplace interference (github.com/xxxx). Extensive experiments demonstrate significant reductions in interference effects and estimation variance compared to baseline methods. Successful deployment in a large-scale marketplace confirms the framework's practical utility for platform-level causal inference.

HCFeb 8
Generative AI in Action: Field Experimental Evidence from Alibaba's Customer Service Operations

Xiao Ni, Yiwei Wang, Tianjun Feng et al.

In collaboration with Alibaba, this study leverages a large-scale field experiment to assess the impact of a generative AI assistant on worker performance in e-commerce after-sales service. Human agents providing digital chat support were randomly assigned with access to a gen AI assistant that offered two core functions: diagnosis of customer issues and solution proposals, presented as text messages. Agents retained discretion to adopt, modify, or disregard AI-generated messages. To evaluate gen AI's impact, we estimate both the intention-to-treat (ITT) effect of gen AI access and the local average treatment effect (LATE) of gen AI usage. Results show that gen AI significantly improved service speed, measured by issue identification time and chat duration. Gen AI also improved subjective service quality reflected in customer ratings and dissatisfaction rates, but it had no significant effect on objective service quality indicated by customer retrial rates. The performance improvements stemmed not only from automation but also from changes in the dynamics of agent-customer interactions: agent communication became more informative and efficient, while customers experienced reduced communication burdens. Low performers achieved the greatest improvements in both service speed and quality, narrowing the performance gap. In contrast, top-performing agents showed little improvement in service speed but experienced declines in both subjective and objective service quality. Evidence suggests that this decline results from increased multitasking tendency, proxied by longer shift-away times across concurrent chats, which slowed customer responses and raised abandonment and retrial rates. These findings suggest that gen AI reshapes work, demanding tailored deployment strategies.