Learning Object Arrangements in 3D Scenes using Human Context
This work addresses the challenge of scalable object arrangement for applications such as robotics and interior design, though the contribution is incremental: it shifts the focus from object-object relationships to human context.
The paper tackles the problem of learning object arrangements in 3D scenes by modeling human-object relationships based on affordances and reachability, achieving an average placement error of 1.6 meters across 20 rooms and a score of 4.3/5 when arranging five real scenes (3.7 for the best baseline).
We consider the problem of learning object arrangements in a 3D scene. The key idea here is to learn how objects relate to human poses based on their affordances, ease of use and reachability. In contrast to modeling object-object relationships, modeling human-object relationships scales linearly in the number of objects. We design appropriate density functions based on 3D spatial features to capture this. We learn the distribution of human poses in a scene using a variant of the Dirichlet process mixture model that allows sharing of the density function parameters across the same object types. Then we can reason about arrangements of the objects in the room based on these meaningful human poses. In our extensive experiments on 20 different rooms with a total of 47 objects, our algorithm predicted correct placements with an average error of 1.6 meters from ground truth. In arranging five real scenes, it received a score of 4.3/5 compared to 3.7 for the best baseline method.
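To make the idea of placing objects relative to human poses concrete, here is a minimal sketch in a toy 2D room. It is not the paper's model: the function names, the Gaussian-shaped distance density, and the parameters `preferred_dist` and `sigma` are all assumptions chosen for illustration, and the poses are given by hand rather than sampled from a Dirichlet process mixture.

```python
import math

def distance_density(obj_xy, pose_xy, preferred_dist, sigma):
    """Hypothetical density: peaks when the object is preferred_dist away from the pose."""
    d = math.dist(obj_xy, pose_xy)
    return math.exp(-0.5 * ((d - preferred_dist) / sigma) ** 2)

def best_placement(candidates, poses, preferred_dist, sigma=0.3):
    """Return the candidate location with the highest total density over all poses."""
    def score(candidate):
        return sum(
            distance_density(candidate, pose, preferred_dist, sigma)
            for pose in poses
        )
    return max(candidates, key=score)

# Toy room: one hand-picked human pose and a row of candidate object locations.
poses = [(2.0, 2.0)]
candidates = [(x * 0.5, 2.0) for x in range(9)]  # x from 0.0 to 4.0

# Assumed preference: the object should sit about 0.5 m from the person.
spot = best_placement(candidates, poses, preferred_dist=0.5)
```

In this sketch each object is scored against the poses independently of the other objects, so the number of density evaluations grows linearly with the number of objects, which mirrors the scaling argument in the abstract.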