CV Dec 9, 2025
A Scalable Pipeline Combining Procedural 3D Graphics and Guided Diffusion for Photorealistic Synthetic Training Data Generation in White Button Mushroom Segmentation
Artúr I. Károly, Péter Galambos
Industrial mushroom cultivation increasingly relies on computer vision for monitoring and automated harvesting. However, developing accurate detection and segmentation models requires large, precisely annotated datasets that are costly to produce. Synthetic data provides a scalable alternative, yet it often lacks the realism needed to generalize to real-world scenarios. This paper presents a novel workflow that integrates 3D rendering in Blender with a constrained diffusion model to automatically generate photorealistic, accurately annotated synthetic images of Agaricus bisporus mushrooms. The approach preserves full control over 3D scene configuration and annotations while achieving photorealism without the need for specialized computer graphics expertise. We release two synthetic datasets (each containing 6,000 images depicting over 250k mushroom instances) and evaluate Mask R-CNN models trained on them in a zero-shot setting. When tested on two independent real-world datasets (including a newly collected benchmark), models trained with our method achieve state-of-the-art segmentation performance (F1 = 0.859 on M18K) despite using only synthetic training data. Although the approach is demonstrated on Agaricus bisporus, the proposed pipeline can be readily adapted to other mushroom species or to other agricultural domains, such as fruit and leaf detection.
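To make the reported F1 figure concrete, the following is a minimal sketch of how an instance-level F1 score can be computed for segmentation masks. It is not the paper's evaluation code: the greedy matching strategy, the IoU threshold of 0.5, the representation of masks as sets of pixel coordinates, and the names `iou` and `instance_f1` are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def instance_f1(pred_masks, gt_masks, iou_threshold=0.5):
    """Instance-level F1 with greedy one-to-one matching (assumed protocol).

    A prediction counts as a true positive if it overlaps a still-unmatched
    ground-truth mask with IoU >= iou_threshold.
    """
    unmatched_gt = list(range(len(gt_masks)))
    tp = 0
    for pred in pred_masks:
        best_j, best_iou = None, iou_threshold
        for j in unmatched_gt:
            score = iou(pred, gt_masks[j])
            if score >= best_iou:
                best_j, best_iou = j, score
        if best_j is not None:
            unmatched_gt.remove(best_j)  # each ground-truth mask matches once
            tp += 1
    fp = len(pred_masks) - tp  # predictions with no matching ground truth
    fn = len(gt_masks) - tp    # ground-truth instances that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example: one correct detection, one spurious prediction, one missed instance.
gt = [{(0, 0), (0, 1)}, {(5, 5)}]
pred = [{(0, 0), (0, 1)}, {(9, 9)}]
print(instance_f1(pred, gt))
```

In the toy example there is one true positive, one false positive, and one false negative, so precision and recall are both 0.5 and so is the F1 score.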
RO Jun 18, 2021
Towards Robotic Laboratory Automation Plug & Play: The "LAPP" Framework
Ádám Wolf, David Wolton, Josef Trapl et al.
Increasing the level of automation in pharmaceutical laboratories and production facilities plays a crucial role in delivering medicine to patients. However, the particular requirements of this field make it challenging to adopt cutting-edge technologies from other industries. This article provides an overview of relevant approaches and how they can be utilized in the pharmaceutical industry, especially in development laboratories. Recent advancements include the application of flexible mobile manipulators capable of handling complex tasks. However, integrating devices from many different vendors into an end-to-end automation system is complicated by the diversity of interfaces. Therefore, various approaches to standardization are considered in this article, and a concept is proposed for taking them a step further. This concept enables a mobile manipulator with a vision system to "learn" the pose of each device and, using a barcode, fetch interface information from a universal cloud database. This information includes control and communication protocol definitions and a representation of the robot actions needed to operate the device. To define the movements relative to the device, each device must carry a fiducial marker as standard, in addition to the barcode. The concept will be elaborated in follow-up papers after appropriate research activities.
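The barcode lookup and marker-relative motion described above can be sketched as follows. This is only an illustration under stated assumptions, not the LAPP implementation: the in-memory `DEVICE_DB` dictionary stands in for the cloud database, the barcode value, protocol name, action name, and waypoints are invented, and the marker pose is simplified to a translation plus a rotation about the vertical axis.

```python
import math

# Hypothetical stand-in for the universal cloud database, keyed by barcode.
# Waypoints are expressed in the device's fiducial-marker frame (metres).
DEVICE_DB = {
    "0123456789": {
        "protocol": "example-protocol",  # assumed placeholder value
        "actions": {
            "press_start": [(0.10, 0.00, 0.05), (0.10, 0.00, 0.00)],
        },
    }
}


def pose_matrix(x, y, z, yaw):
    """4x4 homogeneous transform: translation (x, y, z) and rotation about z."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c, -s, 0.0, x],
        [s, c, 0.0, y],
        [0.0, 0.0, 1.0, z],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(T, p):
    """Apply homogeneous transform T to a 3D point p."""
    x, y, z = p
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))


def plan_action(barcode, action, base_T_marker):
    """Look up marker-relative waypoints and express them in the robot base frame."""
    record = DEVICE_DB[barcode]
    return [transform_point(base_T_marker, p) for p in record["actions"][action]]


# Marker observed 1 m ahead and 2 m to the left of the robot base, rotated 90 degrees.
base_T_marker = pose_matrix(1.0, 2.0, 0.5, math.pi / 2)
for waypoint in plan_action("0123456789", "press_start", base_T_marker):
    print(waypoint)
```

Storing the waypoints in the marker frame is what makes the stored action reusable: the same database record applies wherever the device is placed, because only `base_T_marker` changes when the vision system re-detects the marker.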