SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields
This work addresses object-centric learning in 3D environments, proposing a new method for a known bottleneck in computer vision and AI.
The paper tackles the challenge of learning object-centric representations in 3D scenes by proposing SlotLifter, a model that jointly addresses scene reconstruction and decomposition. It reports state-of-the-art performance in scene decomposition and novel-view synthesis on multiple synthetic and real-world datasets, outperforming existing methods by a large margin.
The ability to distill object-centric abstractions from intricate visual scenes underpins human-level generalization. Despite the significant progress in object-centric learning methods, learning object-centric representations in the 3D physical world remains a crucial challenge. In this work, we propose SlotLifter, a novel object-centric radiance model addressing scene reconstruction and decomposition jointly via slot-guided feature lifting. Such a design unites object-centric learning representations and image-based rendering methods, offering state-of-the-art performance in scene decomposition and novel-view synthesis on four challenging synthetic and four complex real-world datasets, outperforming existing 3D object-centric learning methods by a large margin. Through extensive ablative studies, we showcase the efficacy of designs in SlotLifter, revealing key insights for potential future directions.