WMT24 Test Suite: Gender Resolution in Speaker-Listener Dialogue Roles
This work addresses gender bias in NLP for literary applications, but its contribution is incremental because it is limited to a single, specific test suite.
The study examines gender resolution in literary-style dialogues and the effect of gender stereotypes. It finds that character and manner stereotypes expressed outside the dialogue significantly affect the gender agreement of referents within it.
We assess the difficulty of gender resolution in literary-style dialogue settings and the influence of gender stereotypes. Instances of the test suite contain spoken dialogue interleaved with external meta-context about the characters and the manner of speaking. We find that character and manner stereotypes outside of the dialogue significantly impact the gender agreement of referents within the dialogue.
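To make the instance structure described above more concrete, here is a minimal sketch of how one item and a per-stereotype agreement score might be represented. Everything in it is an assumption made for illustration: the `Instance` class, its field names, the stereotype labels, and the `agreement_rate_by_stereotype` helper are hypothetical and are not taken from the test suite or its evaluation code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One hypothetical test-suite item; field names are illustrative only."""
    dialogue: str         # spoken text containing the referent to resolve
    meta_context: str     # text outside the quotes: character and/or manner of speaking
    stereotype: str       # assumed label for the meta-context, e.g. "male", "female"
    expected_gender: str  # gender the referent should agree with in the output


def agreement_rate_by_stereotype(instances, predicted_genders):
    """Map each stereotype label to the fraction of items with correct gender agreement."""
    if len(instances) != len(predicted_genders):
        raise ValueError("instances and predicted_genders must have the same length")
    hits = defaultdict(int)
    totals = defaultdict(int)
    for inst, pred in zip(instances, predicted_genders):
        totals[inst.stereotype] += 1
        if pred == inst.expected_gender:
            hits[inst.stereotype] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Toy items and toy predictions, invented purely to exercise the helper.
toy_items = [
    Instance('"I am tired," ', "she said to the mechanic.", "male", "female"),
    Instance('"I am tired," ', "she said to the nurse.", "female", "female"),
]
toy_predictions = ["male", "female"]
toy_rates = agreement_rate_by_stereotype(toy_items, toy_predictions)
```

Grouping by the stereotype label is one simple way to see whether agreement changes with the meta-context; the toy values carry no empirical meaning.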