MarioNette: Self-Supervised Sprite Learning
The work addresses the need for artists and video game designers to analyze and edit 2D animations more efficiently, though the contribution appears incremental because it builds on self-supervised learning methods for visual patterns.
The paper tackles the problem of decomposing sprite-based video animations into a disentangled representation of recurring graphic elements, achieving a sparse, consistent, and explicit representation that can be used for editing or analysis.
Artists and video game designers often construct 2D animations using libraries of sprites -- textured patches of objects and characters. We propose a deep learning approach that decomposes sprite-based video animations into a disentangled representation of recurring graphic elements in a self-supervised manner. By jointly learning a dictionary of possibly transparent patches and training a network that places them onto a canvas, we deconstruct sprite-based content into a sparse, consistent, and explicit representation that can be easily used in downstream tasks, like editing or analysis. Our framework offers a promising approach for discovering recurring visual patterns in image collections without supervision.
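To make the "dictionary of possibly transparent patches placed onto a canvas" idea concrete, the following is a minimal sketch of the compositing step only, not the authors' implementation. It assumes a fixed, hand-written dictionary and hand-picked placements, whereas the paper learns both; the names `composite`, `dictionary`, and `placements` are hypothetical, and the standard alpha "over" operator is used as a stand-in for whatever compositing the method actually employs.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]            # opaque color, channels in [0, 1]
RGBA = Tuple[float, float, float, float]    # color plus alpha (0 = transparent)
Sprite = List[List[RGBA]]                   # rows of RGBA pixels
Placement = Tuple[int, int, int]            # (sprite index, top row, left column)


def composite(canvas_size: Tuple[int, int],
              background: RGB,
              dictionary: List[Sprite],
              placements: List[Placement]) -> List[List[RGB]]:
    """Paint dictionary sprites onto a solid background, back to front.

    Later placements are drawn on top of earlier ones.
    """
    height, width = canvas_size
    canvas = [[background for _ in range(width)] for _ in range(height)]
    for sprite_index, top, left in placements:
        sprite = dictionary[sprite_index]
        for dy, row in enumerate(sprite):
            for dx, (r, g, b, a) in enumerate(row):
                y, x = top + dy, left + dx
                if not (0 <= y < height and 0 <= x < width):
                    continue  # clip pixels that fall outside the canvas
                dst_r, dst_g, dst_b = canvas[y][x]
                # "Over" operator: out = alpha * sprite + (1 - alpha) * canvas
                canvas[y][x] = (
                    a * r + (1 - a) * dst_r,
                    a * g + (1 - a) * dst_g,
                    a * b + (1 - a) * dst_b,
                )
    return canvas


# Toy usage: two 1x1 sprites on a 2x2 black canvas.
opaque_red: Sprite = [[(1.0, 0.0, 0.0, 1.0)]]
half_blue: Sprite = [[(0.0, 0.0, 1.0, 0.5)]]
frame = composite(
    (2, 2),
    (0.0, 0.0, 0.0),
    [opaque_red, half_blue],
    [(0, 0, 0), (1, 0, 0), (1, 1, 1)],  # red, then blue over it, then blue alone
)
```

In this sketch a frame is fully described by the dictionary plus a short list of placements, which is the sense in which such a representation is sparse and explicit: moving or swapping a sprite amounts to editing one placement entry.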