Exploring Semi-Supervised Learning for Predicting Listener Backchannels
This work offers an incremental improvement in data annotation efficiency for researchers developing conversational agents, specifically for backchannel prediction.
This paper addresses the manual-annotation bottleneck in listener backchannel prediction for conversational agents by proposing a semi-supervised learning approach. Models trained on the semi-supervised labels achieved 95% of the performance of models trained on manually annotated data, and a user study found that almost 60% of participants perceived the semi-supervised model's backchannel responses as more natural.
Developing human-like conversational agents is a prominent area of HCI research that subsumes many tasks. Predicting listener backchannels is one such actively researched task. Although prior studies have taken different approaches to backchannel prediction, they have all depended on manual annotation of large datasets. This dependence is a bottleneck that limits the scalability of development. To address it, we propose using semi-supervised techniques to automate the identification of backchannels, thereby easing the annotation process. To assess the feasibility of our identification module, we compared backchannel prediction models trained on (a) manually annotated labels and (b) semi-supervised labels. Quantitative analysis revealed that the proposed semi-supervised approach attained 95% of the performance of the model trained on manual annotations. Our user study revealed that almost 60% of the participants found the backchannel responses predicted by the proposed model more natural. Finally, we analyzed the impact of personality on the type of backchannel signals and validated our findings in the user study.
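The abstract does not say which semi-supervised technique is used. As an illustration only, the following minimal sketch shows one common option, self-training with pseudo-labels: a model fitted on a small manually labeled seed set labels the unannotated examples it is confident about, and is then refitted. Everything here is an assumption made for the example and not the authors' method: the single scalar feature, the nearest-centroid classifier, the confidence threshold, and all function names are hypothetical.

```python
from statistics import mean

def fit_centroids(features, labels):
    """Fit a 1-D nearest-centroid model: one mean feature value per class."""
    return {c: mean(f for f, l in zip(features, labels) if l == c)
            for c in sorted(set(labels))}

def predict_with_confidence(centroids, x):
    """Return (label, confidence) for a two-class model.

    Confidence is 0.5 when x is equidistant from both centroids
    and 1.0 when x sits exactly on the nearer centroid.
    """
    (d_near, label), (d_far, _) = sorted(
        (abs(x - m), c) for c, m in centroids.items())[:2]
    total = d_near + d_far
    return label, (1.0 if total == 0 else d_far / total)

def self_train(seed_x, seed_y, unlabeled_x, threshold=0.8, max_rounds=5):
    """Grow the labeled set with confident pseudo-labels, then refit."""
    x, y, pool = list(seed_x), list(seed_y), list(unlabeled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(x, y)
        remaining = []
        for u in pool:
            label, conf = predict_with_confidence(centroids, u)
            if conf >= threshold:
                x.append(u)          # accept the pseudo-label
                y.append(label)
            else:
                remaining.append(u)  # too uncertain; leave unlabeled
        if len(remaining) == len(pool):
            break                    # nothing new was labeled this round
        pool = remaining
    return fit_centroids(x, y), pool

# Hypothetical data: 1 = backchannel, 0 = no backchannel.
seed_x, seed_y = [0.1, 0.2, 0.9, 1.0], [0, 0, 1, 1]
model, leftover = self_train(seed_x, seed_y, [0.05, 0.15, 0.95, 0.5])
```

In this toy run the ambiguous example at 0.5 stays unlabeled, which is the kind of case that would still be passed to a human annotator. The threshold trades annotation savings against the risk of training on wrong pseudo-labels.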