CV, Feb 1

Effectiveness of Automatically Curated Dataset in Thyroid Nodules Classification Algorithms Using Deep Learning

Jichen Yang, Jikai Zhang, Benjamin Wildman-Tobriner, Maciej A. Mazurowski

arXiv:2602.01020v11.5

Originality: Incremental advance

AI Analysis

This work addresses data scarcity in medical imaging for radiologists and researchers, but it is incremental as it builds on a prior method for dataset curation.

This study addresses the limited availability of training data for deep learning models in thyroid nodule classification by evaluating an automatically curated dataset. A model trained on the automatically curated data reached an AUC of 0.694, compared with 0.643 for a model trained on manually annotated data.

Thyroid nodule cancers are commonly diagnosed using ultrasound images. Several studies have shown that deep learning algorithms designed to classify benign and malignant thyroid nodules can match radiologists' performance. However, data availability for training deep learning models is often limited because curating such datasets requires significant effort. A previous study proposed a method to curate thyroid nodule datasets automatically; in testing, it achieved a 63% yield rate and 83% accuracy. However, the usefulness of the generated data for training deep learning models remained unknown. In this study, we conducted experiments to determine whether using an automatically curated dataset improves the performance of deep learning algorithms. We trained deep learning models on the manually annotated and the automatically curated datasets. We also trained on a smaller, more accurate subset of the automatically curated dataset to explore how such a dataset is best used. The deep learning model trained on the manually annotated dataset achieved an AUC of 0.643 (95% confidence interval [CI]: 0.62, 0.66). This is significantly lower than the AUC of the model trained on the automatically curated dataset, 0.694 (95% CI: 0.67, 0.73; P < .001). The model trained on the accurate subset achieved an AUC of 0.689 (95% CI: 0.66, 0.72; P > .43), which is not significantly lower than the AUC obtained with the full automatically curated dataset. In conclusion, we showed that using an automatically curated dataset can substantially increase the performance of deep learning algorithms, and we suggest using all of the data rather than only the accurate subset.
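The abstract reports each AUC with a 95% confidence interval but does not say how the intervals were computed. As a rough illustration only, the sketch below uses a percentile bootstrap over test cases, which is one common choice and is an assumption here, not the authors' stated method. The function names `auc_score` and `bootstrap_auc_ci`, the resample count, and the toy data are all hypothetical.

```python
import numpy as np


def auc_score(labels, scores):
    """AUC as the probability that a malignant case outscores a benign one.

    Computed from all positive/negative pairs (Mann-Whitney U); ties count 0.5.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap (1 - alpha) interval.

    Assumption: cases are resampled with replacement; resamples that
    contain only one class are skipped because AUC is undefined for them.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = rng.integers(0, n, size=n)
        sample = labels[idx]
        if sample.all() or not sample.any():
            continue
        aucs.append(auc_score(sample, scores[idx]))
    lower, upper = np.quantile(aucs, [alpha / 2, 1 - alpha / 2])
    return auc_score(labels, scores), float(lower), float(upper)


if __name__ == "__main__":
    # Toy data, not from the paper: 1 = malignant, 0 = benign.
    rng = np.random.default_rng(42)
    y = rng.integers(0, 2, size=300)
    s = 0.5 * y + rng.normal(size=300)  # weakly informative scores
    auc, lower, upper = bootstrap_auc_ci(y, s)
    print(f"AUC {auc:.3f} (95% CI: {lower:.2f}, {upper:.2f})")
```

Comparing two models on the same test set, as the reported P values imply, would need a paired procedure (for example, bootstrapping the AUC difference over the same resampled cases); that step is left out of this sketch.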
