DB CY DC LG · Mar 10, 2020

Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case

Raphael Thiago, Renan Souza, L. Azevedo, E. Soares, Rodrigo Santos, Wallas Santos, Max De Bayser, M. Cardoso, M. Moreno, Renato Cerqueira

arXiv:2003.04915v12.37 citations

Originality: Synthesis-oriented

AI Analysis

The paper addresses data governance concerns and barriers to ML adoption in a domain-specific context: shale oil and gas production.

The paper tackles training-data lineage for machine learning models in the oil and gas industry, arguing that lineage can be leveraged to benefit the ML lifecycle for models that discover sweet spots in shale production.

Machine Learning (ML) has increased its role, becoming essential in several industries. However, questions around training data lineage, such as "where has the dataset used to train this model come from?", the introduction of several new data protection laws, and the need to meet data governance requirements have hindered the adoption of ML models in the real world. In this paper, we discuss how data lineage can be leveraged to benefit the ML lifecycle to build ML models to discover sweet spots for shale oil and gas production, a major application in the Oil and Gas (O&G) industry.
