SE · AI · Mar 11, 2017

Learning best K analogies from data distribution for case-based software effort estimation

arXiv:1703.04567v111.418 citations

Originality: Incremental advance

AI Analysis

This incremental improvement addresses a key challenge in software effort estimation for developers and project managers by automating case selection.

The paper tackles the problem of selecting the optimal number of similar cases in Case-Based Reasoning for software effort estimation by proposing a bisecting k-medoids clustering technique to understand dataset structure and exclude irrelevant cases, resulting in better performance than regular K-based CBR methods.

Case-Based Reasoning (CBR) has been widely used to generate good software effort estimates. The predictive performance of CBR is dataset dependent and subject to an extremely large space of configuration possibilities. Regardless of the type of adaptation technique, deciding on the optimal number of similar cases to use before applying CBR is a key challenge. In this paper we propose a new technique based on the Bisecting k-medoids clustering algorithm to better understand the structure of a dataset and discover the optimal cases for each individual project by excluding irrelevant cases. The results show that understanding the data characteristics prior to the prediction stage can help in automatically finding the best number of cases for each test project. Performance figures of the proposed estimation method are better than those of other regular K-based CBR methods.
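The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the stopping rule, the distance measure, or the adaptation step, so the sketch assumes Euclidean distance, a hypothetical `min_size` threshold that stops the bisection, and the mean effort of the selected cluster as the estimate. All function names (`medoid`, `bisect`, `leaf_clusters`, `estimate`) and the sample data are invented for illustration.

```python
import math
from statistics import mean

# A case is (features, effort); features is a tuple of numbers.

def medoid(cluster):
    """Return the member with the smallest total distance to all members."""
    return min(cluster, key=lambda c: sum(math.dist(c[0], o[0]) for o in cluster))

def bisect(cluster, max_iter=100):
    """Split one cluster in two with a basic alternating 2-medoids loop."""
    # Deterministic seeds (an assumption): the two most distant cases.
    seeds = max(((a, b) for a in cluster for b in cluster),
                key=lambda ab: math.dist(ab[0][0], ab[1][0]))
    medoids = list(seeds)
    left, right = list(cluster), []
    for _ in range(max_iter):
        left, right = [], []
        for case in cluster:
            d0 = math.dist(case[0], medoids[0][0])
            d1 = math.dist(case[0], medoids[1][0])
            (left if d0 <= d1 else right).append(case)
        if not right:  # all cases identical, nothing to split
            break
        updated = [medoid(left), medoid(right)]
        if updated == medoids:  # medoids stable, so the split has converged
            break
        medoids = updated
    return left, right

def leaf_clusters(cluster, min_size=3):
    """Bisect recursively; keep a cluster whole if a split would go below min_size."""
    if len(cluster) < 2 * min_size:
        return [cluster]
    left, right = bisect(cluster)
    if len(left) < min_size or len(right) < min_size:
        return [cluster]
    return leaf_clusters(left, min_size) + leaf_clusters(right, min_size)

def estimate(history, target, min_size=3):
    """Return (estimated effort, K) using the leaf cluster nearest to the target."""
    leaves = leaf_clusters(history, min_size)
    nearest = min(leaves, key=lambda leaf: math.dist(medoid(leaf)[0], target))
    # K is not fixed in advance: it is the size of the selected cluster.
    return mean(effort for _, effort in nearest), len(nearest)

# Hypothetical historical projects: ((size, complexity), effort).
history = [
    ((1.0, 1.0), 10.0), ((1.0, 2.0), 12.0), ((2.0, 1.0), 14.0),
    ((10.0, 10.0), 100.0), ((10.0, 11.0), 110.0),
    ((11.0, 10.0), 120.0), ((11.0, 11.0), 130.0),
]

print(estimate(history, (1.5, 1.5)))
print(estimate(history, (10.5, 10.5)))
```

In this sketch the number of analogies differs per test project because it equals the size of the cluster the project falls nearest to, and cases in other clusters are excluded as irrelevant. A regular K-based CBR method would instead use the same fixed K for every project.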
