Panoptic Segmentation using Synthetic and Real Data
This work addresses the challenge of data scarcity in industrial vision systems for tasks such as understanding object interactions, though the contribution is incremental, since it builds on existing synthetic data methods.
The paper aims to reduce the need for large quantities of domain-specific labeled data in panoptic segmentation for industrial scenarios. It proposes a pipeline that generates synthetic images from 3D models; these images are labeled automatically and used alongside a small set of real images for fine-tuning. As a result, far fewer real images are needed to reach reasonable performance.
Understanding the relations between a user and the surrounding environment is instrumental for assisting users at a worksite. For instance, recognizing which objects a user is interacting with, from images and video collected through a wearable device, can be used to inform the worker about the usage of specific objects in order to improve productivity and prevent accidents. Although modern vision systems can rely on advanced algorithms for object detection and for semantic and panoptic segmentation, these methods still require large quantities of domain-specific labeled data, which can be difficult to obtain in industrial scenarios. Motivated by this observation, we propose a pipeline that generates synthetic images from 3D models of real environments and real objects. The generated images are automatically labeled and hence effortless to obtain. Using the proposed pipeline, we generate a dataset of synthetic images automatically labeled for panoptic segmentation. This set is complemented by a small number of manually labeled real images for fine-tuning. Experiments show that using synthetic images drastically reduces the number of real images needed to obtain reasonable panoptic segmentation performance.
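To illustrate why labels for rendered images come essentially for free, the following minimal sketch derives per-segment panoptic annotations from a per-pixel instance-ID map of the kind a renderer could emit. It is not the authors' pipeline: the function name `panoptic_segments`, the input layout, the `instance_to_category` mapping, and the output fields (loosely modelled on COCO-style panoptic segment records) are all assumptions made for illustration.

```python
def panoptic_segments(instance_map, instance_to_category, void_id=0):
    """Build segment records from a rendered instance-ID map.

    instance_map: 2D list of ints, one instance ID per pixel
                  (hypothetical renderer output).
    instance_to_category: dict mapping instance ID -> category ID
                  (hypothetical; known from the 3D scene description).
    void_id: ID used for unlabeled pixels (assumption).
    """
    # instance ID -> [area, x_min, y_min, x_max, y_max]
    stats = {}
    for y, row in enumerate(instance_map):
        for x, inst in enumerate(row):
            if inst == void_id:
                continue
            if inst not in stats:
                stats[inst] = [1, x, y, x, y]
            else:
                s = stats[inst]
                s[0] += 1
                s[1] = min(s[1], x)
                s[2] = min(s[2], y)
                s[3] = max(s[3], x)
                s[4] = max(s[4], y)

    segments = []
    for inst in sorted(stats):
        area, x0, y0, x1, y1 = stats[inst]
        segments.append({
            "id": inst,
            "category_id": instance_to_category[inst],
            "area": area,
            # Bounding box as [x, y, width, height], in pixels.
            "bbox": [x0, y0, x1 - x0 + 1, y1 - y0 + 1],
        })
    return segments


# Toy 3x4 "render" containing two object instances.
toy_map = [
    [0, 1, 1, 0],
    [0, 1, 1, 2],
    [0, 0, 2, 2],
]
toy_segments = panoptic_segments(toy_map, {1: 10, 2: 20})
```

Because the renderer knows which 3D object produced each pixel, no manual annotation step appears anywhere in this sketch; the small set of real images mentioned in the abstract would still be labeled by hand.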