Shengyu Cao

GT
h-index1
3papers
3citations
Novelty53%
AI Score40

3 Papers

86.4THMar 22
Collusive Pricing Under LLM

Shengyu Cao, Ming Hu

We study how delegating pricing to large language models (LLMs) can facilitate collusion in a duopoly when both sellers rely on the same pre-trained model. The LLM is characterized by (i) a propensity parameter capturing its internal bias toward high-price recommendations and (ii) an output-fidelity parameter measuring how tightly outputs track that bias; the propensity evolves through retraining. We show that configuring LLMs for robustness and reproducibility can induce collusion via a phase transition: there exists a critical output-fidelity threshold that pins down long-run behavior. Below it, competitive pricing is the unique long-run outcome. Above it, the system is bistable, with competitive and collusive pricing both locally stable and the realized outcome determined by the model's initial preference. The collusive regime resembles tacit collusion: prices are elevated on average, yet occasional low-price recommendations provide plausible deniability. With perfect fidelity, full collusion emerges from any interior initial condition. For finite training batches of size $b$, infrequent retraining (driven by computational costs) further amplifies collusion: conditional on starting in the collusive basin, the probability of collusion approaches one as $b$ grows, since larger batches dampen stochastic fluctuations that might otherwise tip the system toward competition. The indeterminacy region shrinks at rate $O(1/\sqrt{b})$.

65.1GTMar 21
A Solicit-Then-Suggest Model of Agentic Purchasing

Shengyu Cao, Ming Hu

E-commerce is shifting from search-based shopping to agentic purchasing. Rather than relying on keywords, AI shopping agents learn customer preferences through targeted multi-round conversations and then recommend a tailored set of products. We develop a solicit-then-suggest framework to study this setting. In a d-dimensional preference space, an agent conducts m rounds of solicitation to refine its belief about the customer's ideal product, then recommends k products from which the customer chooses. Our analysis identifies the key economic tradeoff. Under a Gaussian prior, we establish an uncertainty decomposition: solicitation depth and assortment breadth are substitutes, with total prior uncertainty split between what solicitation resolves and what assortment breadth hedges. The two instruments improve match quality at very different rates. Expected loss decreases on the order of 1/m with solicitation depth, but only on the order of k^(-2/d) with assortment breadth, reflecting a curse of dimensionality. Thus, a few well-designed questions can achieve what would otherwise require far more recommendations. We also characterize the optimal policy. The optimal assortment forms a Voronoi partition, assigning each product to the posterior region it best serves. With a single recommended product, the optimal solicitation follows a water-filling rule that equalizes posterior uncertainty across dimensions. With multiple products, the optimum may allocate less precision to dimensions that the assortment can hedge. This single-product water-filling rule also yields a general approximation guarantee for larger assortments, and the gap vanishes as dimension grows. Beyond the Gaussian case, the uncertainty decomposition and substitutability between solicitation depth and assortment breadth continue to hold for non-Gaussian priors.

MLDec 21, 2023
Best Arm Identification in Batched Multi-armed Bandit Problems

Shengyu Cao, Simai He, Ruoqing Jiang et al.

Recently multi-armed bandit problem arises in many real-life scenarios where arms must be sampled in batches, due to limited time the agent can wait for the feedback. Such applications include biological experimentation and online marketing. The problem is further complicated when the number of arms is large and the number of batches is small. We consider pure exploration in a batched multi-armed bandit problem. We introduce a general linear programming framework that can incorporate objectives of different theoretical settings in best arm identification. The linear program leads to a two-stage algorithm that can achieve good theoretical properties. We demonstrate by numerical studies that the algorithm also has good performance compared to certain UCB-type or Thompson sampling methods.