Automatic Construction of Real-World Datasets for 3D Object Localization using Two Cameras
This work addresses the challenge of creating labeled datasets for 3D object localization, which is difficult because position labels cannot be assigned manually. The contribution is incremental, as it builds on existing robotics and computer vision techniques.
The paper tackles the problem of generating supervision for precise 3D object localization. It proposes a method to create large real-world datasets in which an industrial robot automatically labels object positions, and applies the method to generate a screw-driver localization dataset with stereo images.
Unlike classification labels, position labels cannot be assigned manually by humans. For this reason, generating supervision for precise object localization is a hard task. This paper details a method to create large datasets for 3D object localization, with real-world images, using an industrial robot to generate position labels. Because the geometry of the robot is known, we are able to automatically synchronize the images of the two cameras with the 3D position of the object. We applied the method to generate a screw-driver localization dataset with stereo images, using a KUKA LBR iiwa robot. This dataset could then be used to train a CNN regressor to learn end-to-end stereo object localization from a set of two standard uncalibrated cameras.
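The labeling idea can be illustrated with a minimal sketch. It is not the implementation used in the paper, and it rests on assumptions: the robot reports its flange pose as a 4x4 homogeneous transform in the base frame, the object is rigidly attached to the flange at a known offset, and stereo frames and robot poses carry timestamps from a shared clock. The names `object_position`, `label_frames`, `tip_offset`, and `max_gap` are hypothetical.

```python
import numpy as np

def object_position(T_base_flange, tip_offset):
    """Map a fixed offset in the flange frame to a 3D point in the robot base frame."""
    # Homogeneous coordinates: append 1 so the transform applies rotation and translation.
    p = np.append(np.asarray(tip_offset, dtype=float), 1.0)
    return (np.asarray(T_base_flange, dtype=float) @ p)[:3]

def label_frames(frame_times, pose_times, poses, tip_offset, max_gap=0.02):
    """Pair each stereo frame with the robot pose closest in time.

    Returns a list of (frame_index, xyz_label). Frames whose nearest pose is
    more than max_gap seconds away are dropped (assumed tolerance).
    """
    pose_times = np.asarray(pose_times, dtype=float)
    labels = []
    for i, t in enumerate(frame_times):
        j = int(np.argmin(np.abs(pose_times - t)))  # index of nearest pose in time
        if abs(pose_times[j] - t) <= max_gap:
            labels.append((i, object_position(poses[j], tip_offset)))
    return labels
```

In this sketch, each returned label would be stored alongside the corresponding left and right images, so that the pair of images and the 3D position form one training example for a regressor.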