Will Radford

h-index10

6papers

157citations

Novelty28%

AI Score21

Ranked #183,072 of 194,257 authors (top 94%)#29,786 in CL (top 97%)

6 Papers

1.0CLApr 7, 2021

How to Write a Bias Statement: Recommendations for Submissions to the Workshop on Gender Bias in NLP

Christian Hardmeier, Marta R. Costa-jussà, Kellie Webster et al.

At the Workshop on Gender Bias in NLP (GeBNLP), we'd like to encourage authors to give explicit consideration to the wider aspects of bias and its social implications. For the 2020 edition of the workshop, we therefore requested that all authors include an explicit bias statement in their work to clarify how their work relates to the social context in which NLP systems are used. The programme committee of the workshops included a number of reviewers with a background in the humanities and social sciences, in addition to NLP experts doing the bulk of the reviewing. Each paper was assigned one of those reviewers, and they were asked to pay specific attention to the provided bias statements in their reviews. This initiative was well received by the authors who submitted papers to the workshop, several of whom said they received useful suggestions and literature hints from the bias reviewers. We are therefore planning to keep this feature of the review process in future editions of the workshop.

11.1CLFeb 21, 2017Code

Learning to generate one-sentence biographies from Wikidata

Andrew Chisholm, Will Radford, Ben Hachey

We investigate the generation of one-sentence Wikipedia biographies from facts derived from Wikidata slot-value pairs. We train a recurrent neural network sequence-to-sequence model with attention to select facts and generate textual summaries. Our model incorporates a novel secondary objective that helps ensure it generates sentences that contain the input facts. The model achieves a BLEU score of 41, improving significantly upon the vanilla sequence-to-sequence model and scoring roughly twice that of a simple template baseline. Human preference evaluation suggests the model is nearly as good as the Wikipedia reference. Manual analysis explores content selection, suggesting the model can trade the ability to infer knowledge against the risk of hallucinating incorrect information.

0.3CLFeb 20, 2017

Post-edit Analysis of Collective Biography Generation

Bo Han, Will Radford, Anaïs Cadilhac et al.

Text generation is increasingly common but often requires manual post-editing where high precision is critical to end users. However, manual editing is expensive so we want to ensure this effort is focused on high-value tasks. And we want to maintain stylistic consistency, a particular challenge in crowd settings. We present a case study, analysing human post-editing in the context of a template-based biography generation system. An edit flow visualisation combined with manual characterisation of edits helps identify and prioritise work for improving end-to-end efficiency and accuracy.

0.8CLNov 7, 2016

:telephone::person::sailboat::whale::okhand:; or "Call me Ishmael" - How do you translate emoji?

Will Radford, Andrew Chisholm, Ben Hachey et al.

We report on an exploratory analysis of Emoji Dick, a project that leverages crowdsourcing to translate Melville's Moby Dick into emoji. This distinctive use of emoji removes textual context, and leads to a varying translation quality. In this paper, we use statistical word alignment and part-of-speech tagging to explore how people use emoji. Despite these simple methods, we observed differences in token and part-of-speech distributions. Experiments also suggest that semantics are preserved in the translation, and repetition is more common in emoji.

10.3CLNov 7, 2016

Presenting a New Dataset for the Timeline Generation Problem

Xavier Holt, Will Radford, Ben Hachey

The timeline generation task summarises an entity's biography by selecting stories representing key events from a large pool of relevant documents. This paper addresses the lack of a standard dataset and evaluative methodology for the problem. We present and make publicly available a new dataset of 18,793 news articles covering 39 entities. For each entity, we provide a gold standard timeline and a set of entity-related articles. We propose ROUGE as an evaluation metric and validate our dataset by showing that top Google results outperform straw-man baselines.

3.0CLJul 19, 2016

Discriminating between similar languages in Twitter using label propagation

Will Radford, Matthias Galle

Identifying the language of social media messages is an important first step in linguistic processing. Existing models for Twitter focus on content analysis, which is successful for dissimilar language pairs. We propose a label propagation approach that takes the social graph of tweet authors into account as well as content to better tease apart similar languages. This results in state-of-the-art shared task performance of $76.63\%$, $1.4\%$ higher than the top system.