IV CV QM · Nov 24, 2025

Development of a fully deep learning model to improve the reproducibility of sector classification systems for predicting unerupted maxillary canine likelihood of impaction

Marzio Galdi, Davide Cannatà, Flavia Celentano, Luigia Rizzo, Domenico Rossi, Tecla Bocchino, Stefano Martina

arXiv:2511.20493v1

Originality: Synthesis-oriented

AI Analysis

This work addresses inconsistent diagnoses among dental professionals, but it is incremental: it applies existing deep learning methods to a specific medical imaging task.

The study tackled the low reproducibility of sector classification systems for predicting maxillary canine impaction by developing a deep learning model. The best-performing model, DenseNet121, achieved 76.8% overall accuracy in allocating impacted canines to three classes.

Objectives. The aim of the present study was to develop a fully deep learning model to improve the intra- and inter-operator reproducibility of sector classification systems for predicting unerupted maxillary canine likelihood of impaction. Methods. Three orthodontists (Os) and three general dental practitioners (GDPs) classified the position of unerupted maxillary canines on 306 radiographs (T0) according to three different sector classification systems (5-, 4-, and 3-sector). The assessment was repeated after four weeks (T1). Intra- and inter-observer agreement were evaluated with Cohen's kappa and Fleiss' kappa, respectively, and between-group differences with a z-test. The same radiographs were used to test different artificial intelligence (AI) models, pre-trained on an extended dataset of 1,222 radiographs. The best-performing model was identified based on its sensitivity and precision. Results. The 3-sector system was the classification method with the highest reproducibility: agreement between observations (Cohen's kappa, T0 versus T1) for each examiner ranged from 0.80 to 0.92, with an overall agreement of 0.85 [95% confidence interval (CI) = 0.83-0.87]. The overall inter-observer agreement (Fleiss' kappa) ranged from 0.69 to 0.70. Educational background did not affect either intra- or inter-observer agreement (p > 0.05). DenseNet121 proved to be the best-performing model in allocating impacted canines to the three classes, with an overall accuracy of 76.8%. Conclusion. AI models can be designed to automatically classify the position of unerupted maxillary canines.
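The two agreement statistics named in the Methods can be illustrated with a minimal sketch in plain Python. This is not the authors' analysis code: the function names `cohen_kappa` and `fleiss_kappa` and the sector labels below are hypothetical, chosen only to show how chance-corrected agreement is computed for one examiner at two time points (Cohen) and for several examiners at once (Fleiss).

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label twice.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal label frequencies of each rating.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # only one label was ever used, by both ratings
    return (observed - expected) / (1.0 - expected)


def fleiss_kappa(ratings_per_subject):
    """Fleiss' kappa; each inner list holds every rater's label for one subject."""
    n_subjects = len(ratings_per_subject)
    n_raters = len(ratings_per_subject[0]) if n_subjects else 0
    if n_raters < 2 or any(len(r) != n_raters for r in ratings_per_subject):
        raise ValueError("every subject needs the same number (>= 2) of ratings")
    totals = Counter()
    agreement_sum = 0.0
    for labels in ratings_per_subject:
        counts = Counter(labels)
        totals.update(counts)
        # Share of rater pairs that agree on this subject.
        pairs = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += pairs / (n_raters * (n_raters - 1))
    p_bar = agreement_sum / n_subjects
    p_e = sum((t / (n_subjects * n_raters)) ** 2 for t in totals.values())
    if p_e == 1.0:
        return 1.0  # only one label was ever used
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical 3-sector labels (made-up data, not from the study).
t0 = [1, 1, 2, 2, 3, 3]  # one examiner, first assessment
t1 = [1, 1, 2, 2, 3, 1]  # same examiner, repeated assessment
intra = cohen_kappa(t0, t1)

# Three examiners rating four radiographs.
panel = [[1, 1, 2], [2, 2, 2], [3, 3, 3], [1, 2, 3]]
inter = fleiss_kappa(panel)
```

Both functions return 1.0 for perfect agreement and values near 0 when agreement is no better than chance, which is the scale on which the reported 0.80 to 0.92 and 0.69 to 0.70 figures should be read.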
