OpenMic: A Multi-Agent-Based Stand-Up Comedy Generation System
This work addresses automated stand-up comedy generation for Chinese audiences, though the contribution appears incremental because it builds on existing multi-agent and retrieval-augmented methods.
The authors tackle the challenge of generating culturally grounded, long-form Chinese stand-up comedy by developing OpenMic, a multi-agent system that transforms a user-provided topic into a 3-5 minute performance and a narrated video. The system relies on multi-round iterative planning, retrieval augmentation, and a fine-tuned joke-writing agent.
Chinese stand-up comedy generation goes beyond plain text generation, requiring culturally grounded humor, precise timing, stage-performance cues, and implicit multi-step reasoning. Moreover, commonly used Chinese humor datasets are often better suited to humor understanding and evaluation than to long-form stand-up generation, so direct supervision on them is misaligned with the target task. To address these challenges, we present OpenMic, an end-to-end multi-agent system built on AutoGen that transforms a user-provided life topic into a 3-5 minute Chinese stand-up performance and further produces a narrated comedy video. OpenMic orchestrates multiple specialized agents in a multi-round iterative planning loop to jointly optimize humor, timing, and performability. To mitigate the dataset-task mismatch, we use retrieval-augmented generation (RAG) for material grounding and idea expansion, and we fine-tune a dedicated JokeWriter to better internalize stand-up-specific setup-punchline structures and long-range callbacks.
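As a rough illustration of the control flow described above (retrieve material, draft, critique, revise over several rounds), the following is a minimal sketch in plain Python. It is not OpenMic's implementation and does not use the AutoGen API: `Draft`, `retrieve`, `run_loop`, `stub_writer`, and `stub_critic` are hypothetical names, the retrieval step is a toy word-overlap ranking rather than a real RAG pipeline, and the writer and critic are stubs standing in for LLM-backed agents.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Draft:
    """State carried across rounds (hypothetical structure)."""
    topic: str
    script: str = ""
    notes: List[str] = field(default_factory=list)
    rounds: int = 0


def retrieve(topic: str, corpus: List[str], k: int = 2) -> List[str]:
    """Toy retrieval: rank snippets by word overlap with the topic."""
    topic_words = set(topic.lower().split())
    ranked = sorted(
        corpus,
        key=lambda s: len(topic_words & set(s.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def run_loop(
    topic: str,
    corpus: List[str],
    writer: Callable[[str, List[str], List[str]], str],
    critic: Callable[[str], List[str]],
    max_rounds: int = 3,
) -> Draft:
    """Draft, critique, and revise until the critic has no more notes."""
    draft = Draft(topic=topic)
    material = retrieve(topic, corpus)
    for _ in range(max_rounds):
        draft.script = writer(topic, material, draft.notes)
        draft.rounds += 1
        feedback = critic(draft.script)
        if not feedback:  # critic is satisfied, stop early
            break
        draft.notes.extend(feedback)
    return draft


def stub_writer(topic: str, material: List[str], notes: List[str]) -> str:
    """Stand-in for a joke-writing agent; reacts only to punchline notes."""
    lines = [f"Setup: so I tried {topic}."]
    lines += [f"Material: {m}" for m in material]
    if any("punchline" in n for n in notes):
        lines.append("Punchline: it went exactly as badly as you think.")
    return "\n".join(lines)


def stub_critic(script: str) -> List[str]:
    """Stand-in for a reviewing agent; asks for a punchline if none exists."""
    return [] if "Punchline:" in script else ["add a punchline"]
```

Calling `run_loop("working from home", corpus, stub_writer, stub_critic)` with a small list of snippet strings as `corpus` returns a `Draft` whose `script` has been revised in response to the critic's notes. In a real system each stub would be replaced by a model-backed agent and the word-overlap ranking by an embedding-based retriever; those choices are assumptions here, not details taken from the abstract.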