Chop & Learn: Recognizing and Generating Object-State Compositions
This work addresses the challenge of generalizing to unseen object-state compositions, a problem of interest to computer vision researchers. It appears incremental, however, since it builds on existing compositional learning methods and pairs them with a new dataset.
The authors tackle the problem of recognizing and generating object-state compositions, such as objects cut in various styles. They introduce the Chop & Learn benchmark suite and a new compositional image generation task. Their results enable transferring cut styles to different objects and support video-based action recognition.
Recognizing and generating object-state compositions is a challenging task, especially when generalizing to unseen compositions. In this paper, we study the task of cutting objects in different styles and the resulting object state changes. We propose a new benchmark suite, Chop & Learn, to accommodate the needs of learning objects and different cut styles using multiple viewpoints. We also propose a new task, Compositional Image Generation, which transfers learned cut styles to different objects by generating novel object-state images. Moreover, we use the videos for Compositional Action Recognition and show valuable uses of this dataset for multiple video tasks. Project website: https://chopnlearn.github.io.