AI, LG · Nov 11, 2023

Modeling Choice via Self-Attention

arXiv:2311.07607v23.93 citations · h-index: 3

Originality: Highly original

AI Analysis

This paper addresses a research gap at the intersection of deep learning and choice modeling, providing both theoretical and empirical foundations for better-estimated inputs to optimization problems in operations management.

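The paper tackles the problem of accurately estimating choice models for operations management optimization by proposing the first choice model that leverages self-attention. It shows that the model's natural nonconvex estimator admits a near-optimal stationary point with O(m) samples, where m is the number of products, whereas the Halo-MNL model it generalizes requires Ω(m^2) samples, and it reports that the model outperforms existing models on an extensive real-data benchmark.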

Models of choice are a fundamental input to many now-canonical optimization problems in the field of Operations Management, including assortment, inventory, and price optimization. Naturally, accurate estimation of these models from data is a critical step in the application of these optimization problems in practice. Concurrently, recent advancements in deep learning have sparked interest in integrating these techniques into choice modeling. However, there is a noticeable research gap at the intersection of deep learning and choice modeling, particularly in work that offers both theoretical and empirical foundations. Thus motivated, we propose the first choice model to successfully (both theoretically and practically) leverage a modern neural network architectural concept (self-attention). Theoretically, we show that our attention-based choice model is a low-rank generalization of the Halo Multinomial Logit (Halo-MNL) model. We prove that whereas the Halo-MNL requires $\Omega(m^2)$ data samples to estimate, where $m$ is the number of products, our model supports a natural nonconvex estimator (in particular, that which a standard neural network implementation would apply) which admits a near-optimal stationary point with $O(m)$ samples. Additionally, we establish the first realistic-scale benchmark for choice model estimation on real data, conducting the most extensive evaluation of existing models to date, thereby highlighting our model's superior performance.
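To make the low-rank idea concrete, here is a minimal sketch rather than the paper's implementation. The abstract does not state the model's functional form, so the sketch assumes a commonly used Halo-MNL form in which the utility of product $j$ in an offered assortment $S$ is a base utility $u_j$ plus pairwise interaction terms $A_{jk}$ summed over the other offered products $k$. The factorization $A = U V^\top$, the rank `r`, and all function and variable names are illustrative assumptions.

```python
import numpy as np

def halo_mnl_probs(u, A, assortment):
    """Choice probabilities over the offered products.

    Assumed form (not taken from the abstract):
    P(j | S) is proportional to exp(u_j + sum of A[j, k] over k in S, k != j).
    u: (m,) base utilities; A: (m, m) pairwise interactions.
    """
    S = np.asarray(assortment)
    A_SS = A[np.ix_(S, S)]
    # Interaction received by each offered product from the *other* offered products.
    halo = A_SS.sum(axis=1) - np.diag(A_SS)
    v = u[S] + halo
    v = v - v.max()  # shift for numerical stability before exponentiating
    p = np.exp(v)
    return p / p.sum()

def low_rank_interactions(U, V):
    """Hypothetical rank-r interaction matrix A = U V^T, with U and V of shape (m, r)."""
    return U @ V.T

rng = np.random.default_rng(0)
m, r = 50, 3  # number of products, assumed rank
u = rng.normal(size=m)
U = 0.1 * rng.normal(size=(m, r))
V = 0.1 * rng.normal(size=(m, r))
A = low_rank_interactions(U, V)

probs = halo_mnl_probs(u, A, [0, 3, 7, 12])

# Parameter counts: a full interaction matrix grows as m^2, the factorization as m * r.
full_params = m + m * m
low_rank_params = m + 2 * m * r
print(probs, full_params, low_rank_params)
```

With a fixed rank, the number of free parameters grows linearly in $m$ instead of quadratically, which is the intuition behind the gap between the $\Omega(m^2)$ and $O(m)$ sample requirements stated above. The sketch only counts parameters and does not reproduce the paper's estimator or its guarantees.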

