Sample size determination for machine learning in medical research
This article addresses a critical methodological gap for medical researchers using machine learning, but its contribution appears incremental because it builds on existing sample size concepts.
The paper tackles the lack of clear guidelines for determining sample sizes in medical machine learning research by proposing a method that starts with the testing set and then calculates training and total sample sizes.
Machine learning (ML) methods are increasingly used across various domains of medical research. However, despite advancements in the use of ML in medicine, clear and definitive guidelines for determining sample sizes in medical ML research are lacking. This article proposes a method for determining sample sizes for medical research utilizing ML methods, beginning with the determination of the testing set sample size, followed by the determination of the training set and total sample sizes.