HC CL

Feb 11, 2023

Synthesizing Human Gaze Feedback for Improved NLP Performance

Varun Khurana, Yaman Kumar Singla, Nora Hollenstein, Rajesh Kumar, Balaji Krishnamurthy

ETH Zurich

arXiv:2302.05721v145.8274 citations, h-index: 35

Originality: Incremental advance

AI Analysis

This work addresses the expensive and privacy-invasive collection of eye-tracking data for NLP research by offering a synthetic alternative that improves model performance. The advance is incremental, as it builds on prior eye-tracking and NLP research.

The paper tackles the challenge of collecting real eye-tracking data for NLP tasks by proposing ScanTextGAN to generate synthetic human scanpaths, and shows that models augmented with these scanpaths improve performance across four NLP tasks on six datasets.

Integrating human feedback can improve the performance of natural language processing (NLP) models. Feedback can be either explicit (e.g., rankings used in training language models) or implicit (e.g., human cognitive signals in the form of eye tracking). Prior eye-tracking and NLP research reveals that signals of cognitive processing, such as human scanpaths gleaned from gaze patterns, aid in the understanding and performance of NLP models. However, collecting real eye-tracking data for NLP tasks is challenging because it requires expensive, precise equipment and raises privacy concerns. To address this challenge, we propose ScanTextGAN, a novel model for generating human scanpaths over text. We show that ScanTextGAN-generated scanpaths can approximate meaningful cognitive signals in human gaze patterns. As a proof of concept, we include synthetically generated scanpaths in four popular NLP tasks spanning six different datasets and show that the models augmented with generated scanpaths improve performance on all downstream NLP tasks.
