PDPK: A Framework to Synthesise Process Data and Corresponding Procedural Knowledge for Manufacturing
The framework provides a baseline tool for researchers in manufacturing AI to generate and compare procedural knowledge datasets, though the contribution is incremental in that it synthesises data rather than collecting real data.
The authors tackled the lack of public datasets combining process data and procedural knowledge in manufacturing by developing a framework to generate synthetic datasets, which they validated against real-world data and used to benchmark embedding methods for knowledge representation.
Procedural knowledge describes how to accomplish tasks and mitigate problems. Such knowledge is commonly held by domain experts, e.g. operators in manufacturing who adjust parameters to achieve quality targets. To the best of our knowledge, no real-world datasets containing process data and corresponding procedural knowledge are publicly available, possibly due to corporate apprehensions about losing a knowledge advantage. Therefore, we provide a framework to generate synthetic datasets that can be adapted to different domains. The design choices are inspired by two real-world datasets of procedural knowledge we have access to. Besides representing procedural knowledge in Resource Description Framework (RDF)-compliant knowledge graphs, the framework simulates parametrisation processes and provides process data consistent with that knowledge. We compare established embedding methods on the resulting knowledge graphs, detailing which out-of-the-box methods have the potential to represent procedural knowledge. This provides a baseline which can be used to increase the comparability of future work. Furthermore, we validate the overall characteristics of a synthesised dataset by comparing the results to those achievable on a real-world dataset. The framework and evaluation code, as well as the dataset used in the evaluation, are available as open source.
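To make the idea of pairing procedural knowledge with consistent process data more concrete, the following is a minimal illustrative sketch in plain Python. It is not the PDPK implementation: the `Rule` class, the `rules_to_ntriples` and `simulate` functions, the `http://example.org/pdpk-sketch#` namespace, the predicate names, and the injection-moulding style symptom and parameter names are all assumptions introduced for illustration. The sketch stores each rule as "if this quality symptom occurs, change this parameter by this amount", serialises the rules as N-Triples (a line-based RDF syntax), and simulates a parametrisation process whose log follows the same rules.

```python
import random
from dataclasses import dataclass

# Hypothetical namespace for this sketch; not taken from the PDPK framework.
EX = "http://example.org/pdpk-sketch#"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


@dataclass(frozen=True)
class Rule:
    """One piece of procedural knowledge: on `symptom`, change `parameter` by `delta`."""
    symptom: str
    parameter: str
    delta: float


def rules_to_ntriples(rules):
    """Serialise each rule as three N-Triples lines (assumed predicate names)."""
    lines = []
    for i, rule in enumerate(rules):
        subject = f"<{EX}rule{i}>"
        lines.append(f"{subject} <{EX}mitigates> <{EX}{rule.symptom}> .")
        lines.append(f"{subject} <{EX}adjusts> <{EX}{rule.parameter}> .")
        lines.append(f'{subject} <{EX}delta> "{rule.delta}"^^<{XSD_DOUBLE}> .')
    return lines


def simulate(rules, start, steps, seed=0):
    """Simulate a parametrisation process.

    At every step a random symptom occurs and the matching rule is applied,
    so the logged process data is consistent with the rules by construction.
    """
    rng = random.Random(seed)
    params = dict(start)
    by_symptom = {rule.symptom: rule for rule in rules}
    symptoms = sorted(by_symptom)
    log = []
    for step in range(steps):
        rule = by_symptom[rng.choice(symptoms)]
        before = params[rule.parameter]
        params[rule.parameter] = before + rule.delta
        log.append({
            "step": step,
            "symptom": rule.symptom,
            "parameter": rule.parameter,
            "before": before,
            "after": params[rule.parameter],
        })
    return log


# Hypothetical example rules and starting parameters.
RULES = [
    Rule("short_shot", "holding_pressure", 5.0),
    Rule("burn_marks", "injection_speed", -2.0),
]
START = {"holding_pressure": 100.0, "injection_speed": 40.0}

TRIPLES = rules_to_ntriples(RULES)
LOG = simulate(RULES, START, steps=10)

for line in TRIPLES:
    print(line)
for record in LOG:
    print(record)
```

Because the log is produced by applying the rules themselves, every recorded parameter change matches a rule in the graph. This mirrors, in a very reduced form, the consistency between knowledge graph and process data described above; a real setup would likely use an RDF library and a richer process model.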