SE · Jan 17, 2014

Lessons Learned and Results from Applying Data-Driven Cost Estimation to Industrial Data Sets

arXiv:1401.4256v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of applying data-driven cost estimation to imperfect industrial data. It is incremental: it builds on existing methods and contributes mainly practical insights from a case study.

The study applied the Optimized Set Reduction (OSR(c)) method to industrial cost estimation data at Toshiba and found that estimation accuracy varied significantly with the data sets used and the way they were preprocessed.

The increasing availability of cost-relevant data in industry allows companies to apply data-intensive estimation methods. However, the available data are often inconsistent, invalid, or incomplete, so most existing data-intensive estimation methods cannot be applied. Only a few estimation methods can deal with imperfect data to a certain extent (e.g., Optimized Set Reduction, OSR(c)). Results from evaluating these methods in practical environments are rare. This article describes a case study on the application of OSR(c) at Toshiba Information Systems (Japan) Corporation. An important result of the case study is that estimation accuracy varies significantly with the data sets used and the way these data are preprocessed. The study supports current results in the area of quantitative cost estimation and clearly illustrates typical problems. Experiences, lessons learned, and recommendations with respect to data preprocessing and data-intensive cost estimation in general are presented.
