Teacher Demonstrations in a BabyLM's Zone of Proximal Development for Contingent Multi-Turn Interaction
This work addresses dialogue quality for BabyLMs, but the contribution is incremental: it builds on existing methods and reports modest improvements.
The paper tackles the problem of improving multi-turn contingency in BabyLMs by introducing ContingentChat, a teacher-student framework that post-trains the model on an alignment dataset. The resulting responses are more grammatical and cohesive, but adaptive teacher decoding strategies yield only limited additional gains.
Multi-turn dialogues between a child and a caregiver are characterized by a property called contingency: prompt, direct, and meaningful exchanges between interlocutors. We introduce ContingentChat, a teacher-student framework that benchmarks and improves multi-turn contingency in a BabyLM trained on 100M words. Using a novel alignment dataset for post-training, the BabyLM generates responses that are more grammatical and cohesive. Experiments with adaptive teacher decoding strategies show limited additional gains. ContingentChat demonstrates the benefits of targeted post-training for dialogue quality and indicates that contingency remains a challenging goal for BabyLMs.