AssetField: Assets Mining and Reconfiguration in Ground Feature Plane Representation
This work addresses the challenge of efficient object editing and scene reconfiguration in neural scene representations, particularly for environments with repetitive structures. The contribution is incremental in that it builds on existing neural representation methods.
The paper tackles the problem of editing and reconfiguring scenes in neural representations by proposing AssetField, which learns object-aware ground feature planes and builds an asset library without supervision. The resulting method achieves competitive novel-view synthesis and generates realistic renderings for new scene configurations.
Both indoor and outdoor environments are inherently structured and repetitive. Traditional modeling pipelines keep an asset library storing unique object templates, which is both versatile and memory efficient in practice. Inspired by this observation, we propose AssetField, a novel neural scene representation that learns a set of object-aware ground feature planes to represent the scene, where an asset library storing template feature patches can be constructed in an unsupervised manner. Unlike existing methods, which require object masks to query spatial points for object editing, our ground feature plane representation offers a natural visualization of the scene in the bird's-eye view, allowing a variety of operations (e.g., translation, duplication, deformation) on objects to configure a new scene. With the template feature patches, group editing is enabled for scenes with many recurring items, avoiding repetitive work on individual objects. We show that AssetField not only achieves competitive performance for novel-view synthesis but also generates realistic renderings for new scene configurations.
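To make the editing idea concrete, the following is a minimal toy sketch of patch-level operations on a bird's-eye-view grid. It is not the paper's implementation: the real ground feature planes are learned, multi-channel, and decoded by a neural field, whereas here each cell holds a single number. All function names (make_plane, crop_patch, paste_patch, duplicate, translate, group_edit) are hypothetical and chosen only for illustration.

```python
# Toy stand-in for a ground feature plane: a 2D grid of scalar "features"
# viewed from above. Assumption: one float per cell instead of a learned
# feature vector, and axis-aligned rectangular patches for objects.

def make_plane(height, width, fill=0.0):
    """Create an empty plane of the given size."""
    return [[fill for _ in range(width)] for _ in range(height)]

def crop_patch(plane, top, left, h, w):
    """Copy the h x w patch whose top-left corner is (top, left)."""
    return [row[left:left + w] for row in plane[top:top + h]]

def paste_patch(plane, patch, top, left):
    """Write a patch into the plane at (top, left), overwriting cells."""
    for dy, row in enumerate(patch):
        for dx, value in enumerate(row):
            plane[top + dy][left + dx] = value

def duplicate(plane, top, left, h, w, new_top, new_left):
    """Duplicate an object by copying its patch to a second location."""
    paste_patch(plane, crop_patch(plane, top, left, h, w), new_top, new_left)

def translate(plane, top, left, h, w, new_top, new_left, fill=0.0):
    """Move an object: copy its patch, clear the source, paste at the target."""
    patch = crop_patch(plane, top, left, h, w)  # copy first, so overlap is safe
    for y in range(top, top + h):
        for x in range(left, left + w):
            plane[y][x] = fill
    paste_patch(plane, patch, new_top, new_left)

def group_edit(plane, placements, new_template):
    """Replace every listed instance of an asset with a new template patch.

    placements is a list of (top, left) corners where the asset occurs.
    """
    for top, left in placements:
        paste_patch(plane, new_template, top, left)
```

In this sketch, an asset library would amount to a mapping from a template patch to the list of placements where it occurs, so that a single call to group_edit updates all recurring instances at once. How AssetField actually discovers templates and placements without supervision is not covered here.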