CL · CY · Jul 25, 2020

Constructing a Testbed for Psychometric Natural Language Processing

arXiv:2007.12969v1 · 2 citations
AI Analysis

This work provides a testbed for psychometric NLP research that could enable timely, unobtrusive analysis of user behavior in domains such as health and e-commerce. The contribution is incremental, since it builds on existing survey-based methods.

The authors address the problem of inferring psychometric constructs from user-generated text by constructing a corpus that aligns each respondent's text with their survey responses, covering over 8,500 respondents. They report preliminary results on predicting survey response labels from the text.

Psychometric measures of ability, attitudes, perceptions, and beliefs are crucial for understanding user behaviors in various contexts, including health, security, e-commerce, and finance. Traditionally, psychometric dimensions have been measured and collected using survey-based methods. Inferring such constructs from user-generated text could afford opportunities for timely, unobtrusive collection and analysis. In this paper, we describe our efforts to construct a corpus for psychometric natural language processing (NLP). We discuss our multi-step process to align user text with the corresponding survey-based response items and provide an overview of the resulting testbed, which encompasses survey-based psychometric measures and accompanying user-generated text from over 8,500 respondents. We report preliminary results on the use of the text to categorize/predict users' survey response labels. We also discuss the important implications of our work and the resulting testbed for future psychometric NLP research.
