CVLGJan 16, 2023

Diverse Multimedia Layout Generation with Multi Choice Learning

arXiv:2301.06629v118 citationsh-index: 67
Originality Highly original
AI Analysis

This addresses the need for creative and user-specific layout generation in design applications, offering a novel approach beyond incremental improvements.

The paper tackles the problem of generating diverse and acceptable layouts for multimedia documents, where existing models fail by averaging over user preferences, and presents LayoutMCL, which reduces Fréchet Inception Distance by 83-98% and increases diversity compared to prior methods.

Designing visually appealing layouts for multimedia documents containing text, graphs and images requires a form of creative intelligence. Modelling the generation of layouts has recently gained attention due to its importance in aesthetics and communication style. In contrast to standard prediction tasks, there are a range of acceptable layouts which depend on user preferences. For example, a poster designer may prefer logos on the top-left while another prefers logos on the bottom-right. Both are correct choices yet existing machine learning models treat layouts as a single choice prediction problem. In such situations, these models would simply average over all possible choices given the same input forming a degenerate sample. In the above example, this would form an unacceptable layout with a logo in the centre. In this paper, we present an auto-regressive neural network architecture, called LayoutMCL, that uses multi-choice prediction and winner-takes-all loss to effectively stabilise layout generation. LayoutMCL avoids the averaging problem by using multiple predictors to learn a range of possible options for each layout object. This enables LayoutMCL to generate multiple and diverse layouts from a single input which is in contrast with existing approaches which yield similar layouts with minor variations. Through quantitative benchmarks on real data (magazine, document and mobile app layouts), we demonstrate that LayoutMCL reduces Fréchet Inception Distance (FID) by 83-98% and generates significantly more diversity in comparison to existing approaches.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes