Roey Ron

3.3 | CV | Jul 28, 2020 | Code

Detection and Segmentation of Custom Objects using High Distraction Photorealistic Synthetic Data

Roey Ron, Gil Elbaz

We show a straightforward and useful methodology for performing instance segmentation using synthetic data. We apply this methodology to a basic case and derive insights through quantitative analysis. We created a new public dataset, the Expo Markers Dataset, intended for detection and segmentation tasks. This dataset contains 5,000 synthetic photorealistic images with their corresponding pixel-perfect segmentation ground truth. The goal is to achieve high performance on manually gathered and annotated real-world data of custom objects. We do that by creating 3D models of the target objects and of other possible distraction objects and placing them within a simulated environment. Expo Markers were chosen for this task because their exact texture, size, and 3D shape fit our requirements for a custom object. An additional advantage is the availability of this object in offices around the world for easy testing and validation of our results. We generate the data using a domain randomization technique that also simulates other photorealistic objects in the scene, known as distraction objects. These objects provide visual complexity, occlusions, and lighting challenges to help our model gain robustness in training. We are also releasing the manually gathered datasets used for comparison with and evaluation of our synthetic dataset. This white paper provides strong evidence that photorealistic simulated data can be used in practical real-world applications as a more scalable and flexible solution than manually captured data. Code is available at the following address: https://github.com/DataGenResearchTeam/expo_markers
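The domain randomization idea in the abstract above (target objects plus distraction objects placed at random in a simulated scene) can be illustrated with a minimal sketch. This is not the authors' pipeline: the asset names, value ranges, and the `PlacedObject`, `SceneConfig`, and `sample_scene` names are all hypothetical, and a real setup would hand such a configuration to a renderer to produce the images and segmentation masks.

```python
import random
from dataclasses import dataclass, field

# Hypothetical asset names; the real dataset's asset list is not given in the abstract.
TARGET_ASSETS = ["expo_marker"]
DISTRACTOR_ASSETS = ["mug", "stapler", "keyboard", "notebook"]


@dataclass
class PlacedObject:
    name: str
    position: tuple      # (x, y, z) in arbitrary scene units (assumed)
    yaw_degrees: float   # rotation about the vertical axis
    is_target: bool      # True for objects the model should segment


@dataclass
class SceneConfig:
    objects: list = field(default_factory=list)
    light_intensity: float = 1.0


def _random_pose(rng):
    """Sample a random position on a unit-square surface and a random yaw."""
    position = (rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0), 0.0)
    return position, rng.uniform(0.0, 360.0)


def sample_scene(rng, max_targets=3, max_distractors=8):
    """Draw one randomized scene: some targets, some distractors, random lighting."""
    scene = SceneConfig(light_intensity=rng.uniform(0.2, 2.0))
    for _ in range(rng.randint(1, max_targets)):
        position, yaw = _random_pose(rng)
        scene.objects.append(
            PlacedObject(rng.choice(TARGET_ASSETS), position, yaw, is_target=True)
        )
    for _ in range(rng.randint(0, max_distractors)):
        position, yaw = _random_pose(rng)
        scene.objects.append(
            PlacedObject(rng.choice(DISTRACTOR_ASSETS), position, yaw, is_target=False)
        )
    return scene


if __name__ == "__main__":
    scene = sample_scene(random.Random(0))
    targets = sum(obj.is_target for obj in scene.objects)
    print(f"{targets} target(s), {len(scene.objects) - targets} distractor(s)")
```

Seeding the generator makes each sampled scene reproducible, which is convenient when a rendered image later needs to be regenerated or debugged.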

13.1 | CV | Jun 18, 2025

HOIDiNi: Human-Object Interaction through Diffusion Noise Optimization

Roey Ron, Guy Tevet, Haim Sawdayee et al.

We present HOIDiNi, a text-driven diffusion framework for synthesizing realistic and plausible human-object interaction (HOI). HOI generation is extremely challenging since it demands strict contact accuracy alongside a diverse motion manifold. While current literature trades off between realism and physical correctness, HOIDiNi optimizes directly in the noise space of a pretrained diffusion model using Diffusion Noise Optimization (DNO), achieving both. This is made feasible thanks to our observation that the problem can be separated into two phases: an object-centric phase, primarily making discrete choices of hand-object contact locations, and a human-centric phase that refines the full-body motion to realize this blueprint. This structured approach allows for precise hand-object contact without compromising motion naturalness. Quantitative, qualitative, and subjective evaluations on the GRAB dataset alone clearly indicate HOIDiNi outperforms prior works and baselines in contact accuracy, physical validity, and overall quality. Our results demonstrate the ability to generate complex, controllable interactions, including grasping, placing, and full-body coordination, driven solely by textual prompts. https://hoidini.github.io.
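The core idea of optimizing in noise space, as described in the abstract above, can be shown with a toy sketch: the generator stays frozen and only the input noise is updated by gradient descent on a task loss. Everything here is an assumption made for illustration. `frozen_generator` is a small fixed linear map standing in for a pretrained diffusion sampler, `contact_loss` is a plain squared error standing in for a contact objective, and gradients are taken by finite differences rather than by backpropagating through a sampling chain. It does not reproduce the two-phase structure of HOIDiNi.

```python
import random

# Hypothetical fixed weights standing in for a pretrained, frozen model.
_W = [[1.0, 0.5], [-0.5, 1.0], [0.25, 0.25]]


def frozen_generator(z):
    """Map a 2-D noise vector to a 3-D output; the weights are never updated."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in _W]


def contact_loss(x, target):
    """Squared distance between the generated output and a desired target."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def numerical_grad(f, z, eps=1e-5):
    """Central-difference gradient of a scalar function f with respect to z."""
    grad = []
    for i in range(len(z)):
        z_plus, z_minus = list(z), list(z)
        z_plus[i] += eps
        z_minus[i] -= eps
        grad.append((f(z_plus) - f(z_minus)) / (2 * eps))
    return grad


def optimize_noise(z0, target, steps=200, lr=0.1):
    """Gradient descent on the noise z while the generator stays frozen."""
    z = list(z0)

    def objective(z_):
        return contact_loss(frozen_generator(z_), target)

    for _ in range(steps):
        g = numerical_grad(objective, z)
        z = [zi - lr * gi for zi, gi in zip(z, g)]
    return z


if __name__ == "__main__":
    rng = random.Random(0)
    z0 = [rng.gauss(0.0, 1.0) for _ in range(2)]
    target = frozen_generator([0.3, -0.7])  # a target the generator can reach
    z = optimize_noise(z0, target)
    print("loss before:", contact_loss(frozen_generator(z0), target))
    print("loss after: ", contact_loss(frozen_generator(z), target))
```

Because only the noise changes, the output stays within whatever the frozen generator can produce, which is the intuition behind steering a pretrained model without retraining it.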

2 Papers