Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models
This work proposes a more developmentally plausible learning signal for interactive language models. It is incremental: it builds on prior reference-game approaches and does not yet show measurable benefits.
The authors train language models in an interactive setting inspired by child language acquisition, using communicative success as the reward in a language-only question-answering task, but find no improvement on linguistic evaluations from this training regime.
We propose a method for training language models in an interactive setting inspired by child language acquisition. In our setting, a speaker attempts to communicate some information to a listener in a single-turn dialogue and receives a reward if communicative success is achieved. Unlike earlier related work using image-caption data for interactive reference games, we operationalize communicative success in a more abstract language-only question-answering setting. First, we present a feasibility study demonstrating that our reward provides an indirect signal about grammaticality. Second, we conduct experiments using reinforcement learning to fine-tune language models. We observe that cognitively plausible constraints on the communication channel lead to interpretable changes in speaker behavior. However, we do not yet see improvements on linguistic evaluations from our training regime. We outline potential modifications to the task design and training configuration that could better position future work to use our methodology to observe the benefits of interaction on language learning in computational cognitive models.
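To make the single-turn setup concrete, the following is a minimal sketch of a communicative-success reward, not the authors' implementation. All names (`communicative_success_reward`, `toy_speaker`, `toy_listener`, `max_message_words`) are hypothetical. Two choices are assumptions made only for illustration: the binary exact-match scoring of the listener's answer, and modeling the channel constraint as a word limit on the speaker's message. The abstract does not specify either.

```python
from typing import Callable, Optional

# Hypothetical interfaces: a speaker maps a context to a message,
# a listener maps (message, question) to an answer.
Speaker = Callable[[str], str]
Listener = Callable[[str, str], str]


def communicative_success_reward(
    speaker: Speaker,
    listener: Listener,
    context: str,
    question: str,
    gold_answer: str,
    max_message_words: Optional[int] = None,
) -> float:
    """Return 1.0 if the listener answers correctly from the message alone, else 0.0."""
    message = speaker(context)
    # Assumed channel constraint: truncate the message to a word budget.
    if max_message_words is not None:
        message = " ".join(message.split()[:max_message_words])
    predicted = listener(message, question)
    # Assumed scoring rule: case-insensitive exact match.
    return 1.0 if predicted.strip().lower() == gold_answer.strip().lower() else 0.0


def toy_speaker(context: str) -> str:
    # Stand-in for a language model: repeats the context verbatim.
    return context


def toy_listener(message: str, question: str) -> str:
    # Stand-in for a question-answering model: guesses the last word of the message.
    words = message.split()
    return words[-1].strip(".,!?") if words else ""


full = communicative_success_reward(
    toy_speaker, toy_listener, "The cat is on the mat.", "Where is the cat?", "mat"
)
clipped = communicative_success_reward(
    toy_speaker, toy_listener, "The cat is on the mat.", "Where is the cat?", "mat",
    max_message_words=3,
)
print(full, clipped)
```

In this toy case the unconstrained message lets the listener recover the answer, while the three-word budget cuts it off, so the speaker would have to change what it says to be rewarded. In a reinforcement-learning setup, a scalar of this kind would serve as the reward for fine-tuning the speaker.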