Dual Graph Multitask Framework for Imbalanced Delivery Time Estimation
This work addresses delivery time estimation for e-commerce platforms and offers a solution to data imbalance that can increase revenue and reduce complaints. Its contribution is incremental, since it builds on existing graph-based and imbalanced regression methods.
The paper tackles imbalanced data in e-commerce delivery time estimation with a dual graph multitask framework. The framework classifies data into head and tail categories and re-weights tail data embeddings using kernel density estimation, which improves performance on real-world Taobao logistics datasets.
Delivery Time Estimation (DTE) is a crucial component of the e-commerce supply chain that predicts delivery time from merchant information, sending address, receiving address, and payment time. Accurate DTE can boost platform revenue and reduce customer complaints and refunds. However, the imbalanced nature of industrial data prevents previous models from reaching satisfactory prediction performance. Although imbalanced regression methods can be applied to the DTE task, we find experimentally that they improve prediction performance on low-shot samples at the expense of overall performance. To address this issue, we propose a novel Dual Graph Multitask framework for imbalanced Delivery Time Estimation (DGM-DTE). Our framework first classifies each package's delivery time as head or tail data. Then, a dual graph-based model learns representations of the two categories of data. In particular, DGM-DTE re-weights the embedding of tail data by estimating its kernel density. We then fuse the two graph-based representations to capture both high-shot and low-shot data. Experiments on real-world Taobao logistics datasets demonstrate the superior performance of DGM-DTE compared to baselines.
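The head/tail split and the density-based re-weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the splitting rule, the kernel, the bandwidth, or the exact weighting formula, so everything below is an assumption made for illustration: a quantile threshold on delivery time, a Gaussian kernel, and inverse-density weights normalised to average 1 over the tail. The function names `gaussian_kde_density` and `head_tail_weights` are hypothetical and are not taken from DGM-DTE.

```python
import numpy as np


def gaussian_kde_density(values, bandwidth):
    """Gaussian kernel density estimate of `values`, evaluated at each of its own points."""
    values = np.asarray(values, dtype=float)
    # Pairwise scaled differences between every pair of samples.
    diffs = (values[:, None] - values[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth


def head_tail_weights(delivery_hours, tail_quantile=0.7, bandwidth=12.0):
    """Split samples into head/tail and weight tail samples by inverse label density.

    Assumptions (not from the paper): the tail is everything above a quantile
    threshold, and tail weights are inverse densities rescaled to average 1.
    """
    delivery_hours = np.asarray(delivery_hours, dtype=float)
    threshold = np.quantile(delivery_hours, tail_quantile)
    is_tail = delivery_hours > threshold

    density = gaussian_kde_density(delivery_hours, bandwidth)
    weights = np.ones_like(delivery_hours)  # head samples keep weight 1
    if is_tail.any():
        inverse_density = 1.0 / density[is_tail]
        weights[is_tail] = inverse_density / inverse_density.mean()
    return is_tail, weights


# Toy labels: eight common 24-hour deliveries and two slow ones.
hours = [24.0] * 8 + [48.0, 120.0]
is_tail, weights = head_tail_weights(hours)
print(is_tail)
print(weights)
```

In this toy input the two slow deliveries fall in the tail, and the rarer 120-hour sample receives a larger weight than the 48-hour one. In a model along the lines of the abstract, such a per-sample weight would scale the tail embedding before fusion; how DGM-DTE actually applies it is not specified here.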