confidence-planner: Easy-to-Use Prediction Confidence Estimation and Sample Size Planning
This tool responds to growing scrutiny from lawmakers and funding agencies, who increasingly expect uncertainty estimates for ML applications that impact society, particularly in medicine and the social sciences. The contribution is incremental, since it builds on existing methods.
The authors tackled the need for statistical uncertainty estimation in machine learning by developing an easy-to-use Python package and web application that provides eight procedures for estimating prediction confidence intervals and planning sample sizes, integrating seamlessly with established data analysis libraries.
Machine learning applications, especially in the fields of medicine and social sciences, are coming under increasing scrutiny. Just as sample size planning is performed in clinical and social studies, lawmakers and funding agencies may expect statistical uncertainty estimations in machine learning applications that impact society. In this paper, we present an easy-to-use Python package and web application for estimating prediction confidence intervals. The package offers eight different procedures to determine and justify the sample size and confidence of predictions from holdout, bootstrap, cross-validation, and progressive validation experiments. Since the package builds directly on established data analysis libraries, it seamlessly integrates into preprocessing and exploratory data analysis steps. Code related to this paper is available at: https://github.com/dabrze/confidence-planner.
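To make the idea of a holdout confidence interval and the matching sample size calculation concrete, the following is a minimal sketch using the textbook normal-approximation (Wald) interval for a proportion. It does not use the confidence-planner API, and it is not necessarily one of the eight procedures the package implements. The function names `holdout_accuracy_interval` and `holdout_sample_size` are hypothetical and introduced only for illustration.

```python
import math
from statistics import NormalDist


def holdout_accuracy_interval(accuracy, n_test, confidence=0.95):
    """Normal-approximation (Wald) interval for holdout accuracy.

    Hypothetical helper for illustration; not part of confidence-planner.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of a proportion estimated from n_test examples.
    half_width = z * math.sqrt(accuracy * (1 - accuracy) / n_test)
    # Clip to [0, 1], since accuracy cannot leave that range.
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


def holdout_sample_size(radius, confidence=0.95, accuracy=0.5):
    """Smallest test set size whose Wald interval has the given radius.

    The default accuracy=0.5 is the worst case (largest variance).
    Hypothetical helper for illustration; not part of confidence-planner.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * accuracy * (1 - accuracy) / radius ** 2)


if __name__ == "__main__":
    # Interval for 80% accuracy measured on 100 holdout examples.
    print(holdout_accuracy_interval(0.8, 100))
    # Test examples needed for a +/- 5 percentage point interval at 95%.
    print(holdout_sample_size(0.05))  # 385
```

The Wald interval is known to be inaccurate for small test sets or accuracies near 0 or 1, so it serves here only to show how interval width and sample size are tied together.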