Multi-Agent Comedy Club: Investigating Community Discussion Effects on LLM Humor Generation
This work addresses an underexamined question in LLM humor generation: how persistent public reception in online communities shapes what a model writes. It is an incremental advance over prior work on multi-turn interaction and feedback.
The study tests whether incorporating broadcast community discussion improves LLM-generated stand-up comedy. The discussion condition was preferred in 75.6% of instances and improved Craft/Clarity and Social Response scores by 0.440 and 0.422, respectively.
Prior work has explored multi-turn interaction and feedback for LLM writing, but evaluations still largely center on prompts and localized feedback, leaving persistent public reception in online communities underexamined. We test whether broadcast community discussion improves stand-up comedy writing in a controlled multi-agent sandbox: in the discussion condition, critic and audience threads are recorded, filtered, stored as social memory, and later retrieved to condition subsequent generations, whereas the baseline omits discussion. Across 50 rounds (250 paired monologues) judged by five expert annotators using A/B preference and a 15-item rubric, discussion wins 75.6% of instances and improves Craft/Clarity (Δ = 0.440) and Social Response (Δ = 0.422), with occasional increases in aggressive humor.
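The record, filter, store, and retrieve loop described above can be illustrated with a minimal sketch. The abstract does not specify how filtering or retrieval is implemented, so everything here is an assumption made for illustration: the names `Comment`, `SocialMemory`, and `build_prompt`, the word-count filter, the word-overlap ranking, and the prompt wording are all hypothetical and are not the authors' method.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author_role: str  # "critic" or "audience", the two thread types named in the abstract
    text: str
    round_index: int


@dataclass
class SocialMemory:
    """Hypothetical store for filtered discussion comments."""
    min_words: int = 4  # assumed filter threshold, not taken from the paper
    entries: list = field(default_factory=list)

    def record(self, comment):
        """Store the comment if it passes the filter; return True if stored."""
        if len(comment.text.split()) < self.min_words:
            return False  # drop very short reactions
        self.entries.append(comment)
        return True

    def retrieve(self, query, k=3):
        """Return up to k comments ranked by word overlap with the query.

        Ties are broken in favor of more recent rounds. Word overlap is a
        stand-in for whatever retrieval the actual system uses.
        """
        query_words = set(query.lower().split())

        def score(comment):
            overlap = len(query_words & set(comment.text.lower().split()))
            return (overlap, comment.round_index)

        return sorted(self.entries, key=score, reverse=True)[:k]


def build_prompt(topic, memory=None, k=3):
    """Build a generation prompt; passing memory=None mimics the baseline."""
    prompt = f"Write a stand-up monologue about {topic}."
    if memory is None:
        return prompt  # baseline condition: no discussion is used
    feedback = memory.retrieve(topic, k)
    if not feedback:
        return prompt
    lines = [f"- [{c.author_role}] {c.text}" for c in feedback]
    return prompt + "\nEarlier community discussion:\n" + "\n".join(lines)
```

In this sketch the two experimental conditions differ only in whether a memory object is passed to `build_prompt`, which mirrors the abstract's description of a baseline that omits discussion.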