Deep Extrapolation for Attribute-Enhanced Generation
This work addresses the problem of generating data beyond the training distribution, which is relevant to researchers in natural language processing and computational biology, though it appears incremental because it builds on existing generative modeling approaches.
The paper tackles the challenge of attribute extrapolation in sequence generation by proposing GENhance, a generative framework that enhances attributes such as sentiment in text and stability in proteins. It generates strongly positive reviews and highly stable protein sequences without being exposed to similar data during training.
Attribute extrapolation in sample generation is challenging for deep neural networks operating beyond the training distribution. We formulate a new task for extrapolation in sequence generation, focusing on natural language and proteins, and propose GENhance, a generative framework that enhances attributes through a learned latent space. Trained on movie reviews and a computed protein stability dataset, GENhance can generate strongly positive text reviews and highly stable protein sequences without being exposed to similar data during training. We release our benchmark tasks and models to contribute to the study of generative modeling extrapolation and data-driven design in biology and chemistry.