Can ChatGPT Code Communication Data Fairly?: Empirical Evidence from Multiple Collaborative Tasks
This work addresses fairness concerns about using AI in large-scale assessment of collaboration and communication, but its contribution is incremental because it builds on prior work showing ChatGPT's coding accuracy.
The study investigated whether ChatGPT-based automated coding of communication data exhibits bias across gender and racial groups, finding no significant bias in three collaborative tasks.
Assessing communication and collaboration at scale depends on the labor-intensive task of coding communication data into categories according to different frameworks. Prior research has established that ChatGPT can be directly instructed with coding rubrics to code communication data and that it achieves accuracy comparable to human raters. However, whether coding by ChatGPT or similar AI technology exhibits bias across demographic groups, such as those defined by gender and race, remains unclear. To fill this gap, this paper investigates ChatGPT-based automated coding of communication data using a typical coding framework for collaborative problem solving, examining differences across gender and racial groups. The analysis draws on data from three types of collaborative tasks: negotiation, problem solving, and decision making. Our results show that ChatGPT-based coding exhibits no significant bias across gender and racial groups, paving the way for its adoption in large-scale assessment of collaboration and communication.
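The abstract does not say which statistic was used to compare groups, so the following is only an illustrative sketch of one plausible check: computing the agreement between human and ChatGPT codes (Cohen's kappa) separately for each demographic group. The function names, group labels, code labels, and records are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def cohen_kappa(human, model):
    """Cohen's kappa between two equal-length sequences of category labels."""
    if len(human) != len(model) or not human:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(human)
    # Observed agreement: share of items where both raters chose the same code.
    observed = sum(h == m for h, m in zip(human, model)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    human_counts = Counter(human)
    model_counts = Counter(model)
    expected = sum(human_counts[c] * model_counts[c] for c in human_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def kappa_by_group(records):
    """Compute kappa per group from (group, human_code, model_code) tuples."""
    by_group = defaultdict(lambda: ([], []))
    for group, human_code, model_code in records:
        by_group[group][0].append(human_code)
        by_group[group][1].append(model_code)
    return {g: cohen_kappa(h, m) for g, (h, m) in by_group.items()}


# Hypothetical toy data: (group, human code, ChatGPT code). Not from the paper.
demo_records = [
    ("group_a", "code_1", "code_1"),
    ("group_a", "code_2", "code_2"),
    ("group_a", "code_1", "code_1"),
    ("group_a", "code_2", "code_2"),
    ("group_b", "code_1", "code_1"),
    ("group_b", "code_1", "code_2"),
    ("group_b", "code_2", "code_2"),
    ("group_b", "code_2", "code_2"),
]
result = kappa_by_group(demo_records)
print(result)
```

A gap in per-group kappa on its own is not evidence of bias; a real analysis would need a significance test and enough coded turns per group, neither of which this sketch covers.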