Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids
This work addresses the need for better 3D shape parsing in computer vision, offering a more flexible and easier-to-learn method compared to traditional cuboid-based approaches.
The paper tackles the problem of abstracting complex 3D shapes with part-based representations by using superquadrics instead of cuboids, resulting in more expressive scene parses that capture fine details and complex poses, as demonstrated on ShapeNet and SURREAL datasets.
Abstracting complex 3D shapes with parsimonious part-based representations has been a long standing goal in computer vision. This paper presents a learning-based solution to this problem which goes beyond the traditional 3D cuboid representation by exploiting superquadrics as atomic elements. We demonstrate that superquadrics lead to more expressive 3D scene parses while being easier to learn than 3D cuboid representations. Moreover, we provide an analytical solution to the Chamfer loss which avoids the need for computational expensive reinforcement learning or iterative prediction. Our model learns to parse 3D objects into consistent superquadric representations without supervision. Results on various ShapeNet categories as well as the SURREAL human body dataset demonstrate the flexibility of our model in capturing fine details and complex poses that could not have been modelled using cuboids.