CL · SI · Jun 17, 2016

Gender Inference using Statistical Name Characteristics in Twitter

arXiv:1606.05467v2 · 30 citations
Originality: Incremental advance
AI Analysis

This paper addresses gender inference for social media analysis. Its contribution is incremental: it extends existing name-based methods so that they can handle noisy data.

The paper tackles the problem of inferring gender from Twitter user names, which are often ill-formed or missing from name dictionaries. It proposes a classifier that extracts statistical characteristics from a name and uses them to assign a gender, which lets it classify international first names as well as ill-formed ones.

Much attention has been given to the task of gender inference of Twitter users. Although names are strong gender indicators, the names of Twitter users are rarely used as a feature, probably because of the high number of ill-formed names, which cannot be found in any name dictionary. Instead of relying solely on a name database, we propose a novel name classifier. Our approach extracts characteristics from the user names and uses them to assign each name to a gender. This enables us to classify international first names as well as ill-formed names.
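
To make the idea of classifying names by their statistical characteristics concrete, the following is a minimal sketch, not the authors' method. It assumes character bigrams, the last letter, and a capped name length as features, and it uses a hand-written naive Bayes model with Laplace smoothing. The feature set, the toy training names, and the names `name_features` and `NameGenderClassifier` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def name_features(name, n=2):
    """Extract simple character-level features (an assumed feature set)."""
    # Keep letters only, so decorations such as "~*Julia*~" reduce to "julia".
    # str.isalpha also accepts non-ASCII letters, which suits international names.
    s = "".join(ch for ch in name.lower() if ch.isalpha())
    if not s:
        return []
    padded = "^" + s + "$"  # mark the start and end of the name
    feats = ["ng=" + padded[i:i + n] for i in range(len(padded) - n + 1)]
    feats.append("last=" + s[-1])
    feats.append("len=" + str(min(len(s), 10)))
    return feats


class NameGenderClassifier:
    """Naive Bayes over name features with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.feature_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocab = set()

    def fit(self, names, labels):
        for name, label in zip(names, labels):
            self.label_counts[label] += 1
            for feat in name_features(name):
                self.feature_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, name):
        feats = name_features(name)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            counts = self.feature_counts[label]
            # The extra +1 reserves probability mass for unseen features.
            denom = sum(counts.values()) + self.alpha * (len(self.vocab) + 1)
            score = math.log(self.label_counts[label] / total)
            for feat in feats:
                score += math.log((counts[feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this example (not taken from the paper).
TOY_NAMES = ["anna", "maria", "julia", "sophia", "emma", "laura",
             "john", "peter", "mark", "thomas", "david", "robert"]
TOY_LABELS = ["f"] * 6 + ["m"] * 6

if __name__ == "__main__":
    clf = NameGenderClassifier().fit(TOY_NAMES, TOY_LABELS)
    for raw in ["~*Julia*~", "john_1987", "Mariaaa"]:
        print(raw, "->", clf.predict(raw))
```

Because the model scores sub-word patterns rather than looking up whole names, a string such as "Mariaaa" can still be assigned a label even though it would not appear in a name dictionary. A realistic system would need a far larger labelled name list and a held-out evaluation.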

