CV · Mar 19

Improved Convex Decomposition with Ensembling and Negative Primitives

arXiv:2405.19569 · 47.1 · h-index: 16
Predicted impact: top 73% in CV · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurately abstracting complex scenes into geometric primitives for computer vision applications. It is an incremental advance whose gains are in depth and segmentation accuracy.

The paper tackles scene decomposition into geometric primitives by introducing negative primitives, which are subtracted from the positive ones, and by using ensembling to choose how many positive and negative primitives to fit. On the NYUv2 dataset, this yields substantial improvements in depth representation and segmentation over state-of-the-art methods.

Describing a scene in terms of primitives -- geometrically simple shapes that offer a parsimonious but accurate abstraction of structure -- is an established and difficult fitting problem. Different scenes require different numbers of primitives, and these primitives interact strongly. Existing methods are evaluated by comparing predicted depth, normals, and segmentation against ground truth. The state-of-the-art method uses a learned regression procedure to predict a start point consisting of a fixed number of primitives, followed by a descent method that refines the geometry and removes redundant primitives. CSG (Constructive Solid Geometry) representations are significantly enhanced by a set-differencing operation. Our representation incorporates negative primitives, which are differenced from the positive primitives. These notably enrich the geometry that the model can encode, while complicating the fitting problem. This paper presents a method that can (a) incorporate these negative primitives and (b) choose the overall number of positive and negative primitives by ensembling. Extensive experiments on the standard NYUv2 dataset confirm that (a) this approach results in substantial improvements in depth representation and segmentation over SOTA and (b) negative primitives improve fitting accuracy. Our method is robustly applicable across datasets: in a first, we evaluate primitive prediction for LAION images.
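The set-differencing idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation. It assumes, for illustration only, that each convex primitive is an intersection of half-spaces and that a point is occupied when it lies in some positive primitive and in no negative one. The names `Convex`, `box`, `occupied`, `mismatch`, and `pick_best` are hypothetical. The last function shows one plausible reading of choosing primitive counts by ensembling: fit several candidates and keep the one with the lowest error. The abstract does not spell out the selection rule, so treat that part as an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class Convex:
    """Hypothetical convex primitive: all points x with n . x <= d for each face."""
    normals: List[Point]
    offsets: List[float]

    def contains(self, p: Point) -> bool:
        return all(
            n[0] * p[0] + n[1] * p[1] + n[2] * p[2] <= d
            for n, d in zip(self.normals, self.offsets)
        )


def box(lo: Point, hi: Point) -> Convex:
    """Axis-aligned box, used here only as an easy-to-read convex primitive."""
    normals = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    offsets = [hi[0], -lo[0], hi[1], -lo[1], hi[2], -lo[2]]
    return Convex(normals, offsets)


def occupied(p: Point, positives: List[Convex], negatives: List[Convex]) -> bool:
    """CSG difference: (union of positives) minus (union of negatives)."""
    in_pos = any(c.contains(p) for c in positives)
    in_neg = any(c.contains(p) for c in negatives)
    return in_pos and not in_neg


def mismatch(samples, positives, negatives) -> int:
    """Count labelled sample points whose occupancy the model gets wrong."""
    return sum(occupied(p, positives, negatives) != label for p, label in samples)


def pick_best(candidates, samples):
    """Assumed selection rule: keep the (positives, negatives) pair with lowest error."""
    return min(candidates, key=lambda c: mismatch(samples, c[0], c[1]))


# A solid block with a notch carved out of its top by one negative primitive.
outer = box((0, 0, 0), (4, 4, 4))
notch = box((1, 1, 1), (3, 3, 4))

# Labelled points standing in for ground-truth geometry (made up for this example).
samples = [
    ((0.5, 0.5, 0.5), True),   # inside the block, outside the notch
    ((2.0, 2.0, 2.0), False),  # inside the notch, so carved away
    ((2.0, 2.0, 0.5), True),   # below the notch floor
]

candidates = [([outer], []), ([outer], [notch])]
best_positives, best_negatives = pick_best(candidates, samples)
```

In this toy case the candidate with one negative primitive matches every sample, while the positives-only candidate misclassifies the carved-out point. Representing the same notch with positive convexes alone would take several pieces, which is the sense in which negative primitives enrich the geometry a fixed budget can encode.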

