CV May 7, 2022
Synthetic Point Cloud Generation for Class Segmentation Applications
Maria Gonzalez Stefanelli, Avi Rajesh Jain, Sandeep Kamal Jalui et al.
Maintenance of industrial facilities is a growing hazard due to the cumbersome process needed to identify infrastructure degradation. Digital Twins have the potential to improve maintenance by monitoring the continuous digital representation of infrastructure. However, the time needed to map the existing geometry makes their use prohibitive. We previously developed class segmentation algorithms to automate digital twinning; however, these require a vast amount of annotated point clouds. Currently, synthetic data generation for automated segmentation is non-existent. We used Helios++ to automatically segment point clouds from 3D models. Our research has the potential to pave the way for efficient industrial class segmentation.
IR May 1
Negative Data Mining for Contrastive Learning in Dense Retrieval at IKEA.com
Eva Agapaki, Amritpal Singh Gill
Contrastive learning is a core component of modern retrieval systems, but its effectiveness heavily relies on the quality of negative examples used during training. In this work, we present a systematic approach to improving dense retrieval for IKEA product search through structured negative sampling strategies and scalable LLM-as-a-judge relevance evaluation. Building on IKEA Search Engine's late-interaction retrieval architectures, we introduce two key contributions: (1) structured negative sampling strategies that leverage product hierarchical taxonomy and product attributes to generate semantically challenging negatives, and (2) a comprehensive LLM-based evaluation methodology for generating training data. Rather than relying on sparse human annotations or random sampling, our LLM-based evaluation system allocates a score for all candidate products against each query. Our methodology achieves +2.6% average category accuracy on offline real user query experiments on the Canada market. However, our A/B test on long-tail queries showed no statistically significant differences in user engagement metrics between the improved and baseline models (p > 0.05). We trace this gap to user search behavior: 67% of popular searches exhibit zero-click rates above 50%, indicating that a substantial proportion of search sessions result in no product engagement regardless of result ranking. These findings underscore the importance of hard negative mining but also the need for grounding training data and offline evals in real user search behavior, including query intent distribution and zero-click patterns, to bridge the gap between offline retrieval quality and online user engagement.
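As a rough illustration of what taxonomy-based negative sampling could look like, the sketch below ranks candidate negatives by how much of the positive product's category path they share, so that near-miss products are preferred over unrelated ones. The `Product` class, the `category_path` field, and the rule of excluding products in the same leaf category are all assumptions made for this example; the abstract does not specify the authors' actual sampling procedure.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    # Hypothetical product record; field names are assumptions for this sketch.
    product_id: str
    category_path: tuple  # e.g. ("furniture", "tables", "desks")


def shared_depth(path_a, path_b):
    """Number of leading taxonomy levels the two category paths have in common."""
    depth = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        depth += 1
    return depth


def sample_hard_negatives(positive, catalog, k, rng):
    """Pick k negatives that sit close to the positive in the taxonomy.

    Assumption: products in the same leaf category are skipped, since they
    may be relevant to the query and would be risky to label as negatives.
    """
    candidates = [
        p for p in catalog
        if p.product_id != positive.product_id
        and p.category_path != positive.category_path
    ]
    rng.shuffle(candidates)  # random tie-breaking among equally close products
    # Stable sort: closest taxonomy neighbours first.
    candidates.sort(
        key=lambda p: shared_depth(p.category_path, positive.category_path),
        reverse=True,
    )
    return candidates[:k]


catalog = [
    Product("desk-1", ("furniture", "tables", "desks")),
    Product("desk-2", ("furniture", "tables", "desks")),
    Product("dining-1", ("furniture", "tables", "dining")),
    Product("sofa-1", ("furniture", "sofas", "two-seat")),
    Product("lamp-1", ("lighting", "lamps", "floor")),
]
negatives = sample_hard_negatives(catalog[0], catalog, k=2, rng=random.Random(0))
print([p.product_id for p in negatives])
```

In this toy catalog the dining table shares two taxonomy levels with the desk and the sofa shares one, so they are chosen ahead of the lamp, which shares none.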
CV Feb 10, 2022
Geometric Digital Twinning of Industrial Facilities: Retrieval of Industrial Shapes
Eva Agapaki, Ioannis Brilakis
This paper devises, implements and benchmarks a novel shape retrieval method that can accurately match individual labelled point clusters (instances) of existing industrial facilities with their respective CAD models. It employs a combination of image and point cloud deep learning networks to classify and match instances to their geometrically similar CAD model. It extends our previous research on geometric digital twin generation from point cloud data, which currently is a tedious, manual process. Experiments with our joint network reveal that it can reliably retrieve CAD models at 85.2% accuracy. The proposed research is a fundamental framework to enable the geometric Digital Twin (gDT) pipeline and incorporate the real geometric configuration into the Digital Twin.
CY Jan 5, 2021
CLOI: An Automated Benchmark Framework For Generating Geometric Digital Twins Of Industrial Facilities
Eva Agapaki, Ioannis Brilakis
This paper devises, implements and benchmarks a novel framework, named CLOI, that can accurately generate individual labelled point clusters of the most important shapes of existing industrial facilities with minimal manual effort in a generic point-level format. CLOI employs a combination of deep learning and geometric methods to segment the points into classes and individual instances. The current geometric digital twin generation from point cloud data in commercial software is a tedious, manual process. Experiments with our CLOI framework reveal that the method can reliably segment complex and incomplete point clouds of industrial facilities, yielding 82% class segmentation accuracy. Compared to the current state-of-practice, the proposed framework can realize estimated time-savings of 30% on average. CLOI is the first framework of its kind to have achieved geometric digital twinning for the most important objects of industrial factories. It provides the foundation for further research on the generation of semantically enriched digital twins of the built environment.
CV Dec 24, 2020
Instance Segmentation of Industrial Point Cloud Data
Eva Agapaki, Ioannis Brilakis
The challenge that this paper addresses is how to efficiently minimize the cost and manual labour of automatically generating object-oriented geometric Digital Twins (gDTs) of industrial facilities, so that the benefits provide even more value compared to the initial investment to generate these models. Our previous work achieved the current state-of-the-art class segmentation performance (75% average accuracy per point and 90% average AUC on the CLOI dataset classes), as presented in (Agapaki and Brilakis 2020), and directly produces labelled point clusters of the objects that are most important to model (CLOI classes) from laser scanned industrial data. CLOI stands for C-shapes, L-shapes, O-shapes, I-shapes and their combinations. However, the problem of automated segmentation of individual instances that can then be used to fit geometric shapes remains unsolved. We argue that the use of instance segmentation algorithms has the theoretical potential to provide the output needed for the generation of gDTs. We solve instance segmentation in this paper through (a) a CLOI-Instance graph connectivity algorithm that segments the point clusters of an object class into instances and (b) boundary segmentation of points that improves step (a). Our method was tested on the CLOI benchmark dataset (Agapaki et al. 2019) and segmented instances with 76.25% average precision and 70% average recall per point among all classes. These results show that it is the first method to automatically segment industrial point cloud shapes with no prior knowledge other than the class point label, and that it is the bedrock for efficient gDT generation in cluttered industrial point clouds.
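To make the idea in step (a) more concrete, the sketch below groups class-labelled points into instances with a plain Euclidean connectivity rule: two points belong to the same instance if they carry the same class label and are linked by a chain of neighbours no further apart than a chosen radius. This is a generic connected-components illustration, not the CLOI-Instance algorithm itself; the function name, the `radius` parameter, and the brute-force neighbour search are assumptions, and the boundary segmentation of step (b) is not modelled.

```python
import math
from collections import deque


def segment_instances(points, labels, radius):
    """Return one instance id per point.

    Hypothetical helper: points with the same class label that are connected
    through neighbours within `radius` receive the same instance id.
    Brute-force O(n^2) neighbour search, for illustration only.
    """
    instance_ids = [-1] * len(points)  # -1 marks a point not yet assigned
    next_id = 0
    for seed in range(len(points)):
        if instance_ids[seed] != -1:
            continue
        # Grow a new instance from this seed with breadth-first search.
        instance_ids[seed] = next_id
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            for j in range(len(points)):
                if (
                    instance_ids[j] == -1
                    and labels[j] == labels[i]
                    and math.dist(points[i], points[j]) <= radius
                ):
                    instance_ids[j] = next_id
                    queue.append(j)
        next_id += 1
    return instance_ids


points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 0.0, 0.0), (0.05, 0.0, 0.0)]
labels = ["pipe", "pipe", "pipe", "beam"]
print(segment_instances(points, labels, radius=0.5))
```

Here the two nearby pipe points merge into one instance, the distant pipe point forms its own instance, and the beam point stays separate despite being spatially close, because connectivity is only evaluated within a class.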