CV AI DB
Jun 20, 2024

SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset

arXiv:2406.14477v1, 28 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses safety risks in text-to-video generation that matter to AI developers and users. The contribution is incremental: it builds on existing alignment research and adds a new human preference dataset.

The authors tackle harmful outputs from large vision models in text-to-video generation by creating the SafeSora dataset, which contains 14,711 prompts, 57,333 videos, and 51,691 pairs of human preference annotations. The dataset is intended for aligning models with human values along two dimensions, helpfulness and harmlessness.

To mitigate the risk of harmful outputs from large vision models (LVMs), we introduce the SafeSora dataset to promote research on aligning text-to-video generation with human values. This dataset encompasses human preferences in text-to-video generation tasks along two primary dimensions: helpfulness and harmlessness. To capture in-depth human preferences and facilitate structured reasoning by crowdworkers, we subdivide helpfulness into 4 sub-dimensions and harmlessness into 12 sub-categories, serving as the basis for pilot annotations. The SafeSora dataset includes 14,711 unique prompts, 57,333 unique videos generated by 4 distinct LVMs, and 51,691 pairs of preference annotations labeled by humans. We further demonstrate the utility of the SafeSora dataset through several applications, including training the text-video moderation model and aligning LVMs with human preference by fine-tuning a prompt augmentation module or the diffusion model. These applications highlight its potential as the foundation for text-to-video alignment research, such as human preference modeling and the development and validation of alignment algorithms.
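To make the idea of pairwise preference annotations along two dimensions concrete, here is a minimal Python sketch. The record layout, its field names, and the use of a Bradley-Terry style loss are assumptions for illustration only; the abstract does not specify the dataset schema or the preference-modeling objective.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """Hypothetical record for one annotated comparison (field names are illustrative)."""
    prompt: str
    video_a: str        # identifier of the first generated video
    video_b: str        # identifier of the second generated video
    more_helpful: str   # "a" or "b"
    more_harmless: str  # "a" or "b"


def bradley_terry_loss(score_preferred: float, score_rejected: float) -> float:
    """Negative log-likelihood that the preferred item wins under a Bradley-Terry model."""
    margin = score_preferred - score_rejected
    # -log(sigmoid(margin)), written in a numerically stable form
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pair_loss(pair: PreferencePair, scores: dict, dimension: str) -> float:
    """Loss for one pair on one dimension ("helpfulness" or "harmlessness")."""
    winner = pair.more_helpful if dimension == "helpfulness" else pair.more_harmless
    if winner == "a":
        preferred, rejected = pair.video_a, pair.video_b
    else:
        preferred, rejected = pair.video_b, pair.video_a
    return bradley_terry_loss(scores[preferred], scores[rejected])


# Usage: scores would come from a learned model; here they are made-up numbers.
example = PreferencePair("a cat on a sofa", "vid_1", "vid_2",
                         more_helpful="a", more_harmless="b")
example_scores = {"vid_1": 1.5, "vid_2": 0.2}
helpfulness_loss = pair_loss(example, example_scores, "helpfulness")
harmlessness_loss = pair_loss(example, example_scores, "harmlessness")
```

Keeping the two dimensions as separate labels lets a single pair disagree across them: in the example, one video is judged more helpful while the other is judged more harmless, so the same scores give a low loss on one dimension and a high loss on the other.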

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
