SYLG Nov 11, 2022

Data Quality Over Quantity: Pitfalls and Guidelines for Process Analytics

arXiv:2211.06440v2, 5 citations, h-index: 27
Originality: Synthesis-oriented
AI Analysis

This work addresses data quality issues for practitioners in industrial process control and analytics, offering incremental guidelines to improve real-world AI applications.

The paper addresses data acquisition and preparation in industrial process analytics. It argues that data quality has a greater impact on real-world AI success than complex modeling techniques, and it provides best practices for pre-processing industrial time series data to develop reliable soft sensors.

A significant portion of the effort involved in advanced process control, process analytics, and machine learning involves acquiring and preparing data. Literature often emphasizes increasingly complex modelling techniques with incremental performance improvements. However, when industrial case studies are published, they often lack important details on data acquisition and preparation. Although data pre-processing is unfairly maligned as trivial and technically uninteresting, in practice it has an outsized influence on the success of real-world artificial intelligence applications. This work describes best practices for acquiring and preparing operating data to pursue data-driven modelling and control opportunities in industrial processes. We present practical considerations for pre-processing industrial time series data to inform the efficient development of reliable soft sensors that provide valuable process insights.

