IV CVSep 3, 2022

Classification of Breast Tumours Based on Histopathology Images Using Deep Features and Ensemble of Gradient Boosting Methods

Mohammad Reza Abbasniya, Sayed Ali Sheikholeslamzadeh, Hamid Nasiri, Samaneh Emami

arXiv:2209.01380v16.617 citationsh-index: 19Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses early-stage breast cancer diagnosis for medical applications, but it is incremental as it focuses on improving classification with existing methods rather than introducing new paradigms.

The paper tackled breast cancer classification from histopathology images by using deep feature transfer learning with Inception-ResNet-v2 and an ensemble of gradient boosting methods (CatBoost, XGBoost, LightGBM), achieving accuracies of 96.82%, 95.84%, 97.01%, and 96.15% across different magnifications on the BreakHis dataset.

Breast cancer is the most common cancer among women worldwide. Early-stage diagnosis of breast cancer can significantly improve the efficiency of treatment. Computer-aided diagnosis (CAD) systems are widely adopted in this issue due to their reliability, accuracy and affordability. There are different imaging techniques for a breast cancer diagnosis; one of the most accurate ones is histopathology which is used in this paper. Deep feature transfer learning is used as the main idea of the proposed CAD system's feature extractor. Although 16 different pre-trained networks have been tested in this study, our main focus is on the classification phase. The Inception-ResNet-v2 which has both residual and inception networks profits together has shown the best feature extraction capability in the case of breast cancer histopathology images among all tested CNNs. In the classification phase, the ensemble of CatBoost, XGBoost and LightGBM has provided the best average accuracy. The BreakHis dataset was used to evaluate the proposed method. BreakHis contains 7909 histopathology images (2,480 benign and 5,429 malignant) in four magnification factors. The proposed method's accuracy (IRv2-CXL) using 70% of BreakHis dataset as training data in 40x, 100x, 200x and 400x magnification is 96.82%, 95.84%, 97.01% and 96.15%, respectively. Most studies on automated breast cancer detection have focused on feature extraction, which made us attend to the classification phase. IRv2-CXL has shown better or comparable results in all magnifications due to using the soft voting ensemble method which could combine the advantages of CatBoost, XGBoost and LightGBM together.

View on arXiv PDF Code

Similar