CL, Aug 17, 2021

Contextualizing Variation in Text Style Transfer Datasets

arXiv:2108.07871v130.9677 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses a gap for researchers in natural language processing by providing a framework to guide dataset selection and comparison in text style transfer, though it is incremental as it builds on existing datasets without introducing new methods.

The paper addresses the lack of a systematic understanding of how text style transfer datasets relate to one another. It conducts empirical analyses of existing datasets and, based on the results, proposes a categorization of stylistic and dataset properties to support their use and comparison.

Text style transfer involves rewriting the content of a source sentence in a target style. Despite there being a number of style tasks with available data, there has been limited systematic discussion of how text style datasets relate to each other. This understanding, however, is likely to have implications for selecting multiple data sources for model training. While it is prudent to consider inherent stylistic properties when determining these relationships, we also must consider how a style is realized in a particular dataset. In this paper, we conduct several empirical analyses of existing text style datasets. Based on our results, we propose a categorization of stylistic and dataset properties to consider when utilizing or comparing text style datasets.
