Metastatic Breast Cancer Prognostication Through Multimodal Integration of Dimensionality Reduction Algorithms and Classification Algorithms
This work addresses the high mislabeling rates (over 60%) in manual classification by pathologists, which is critical for improving diagnosis and saving lives in metastatic cancer care, though it appears incremental as it applies existing ML methods to this domain.
The study tackled the problem of detecting metastatic breast cancer in pathology scans by integrating dimensionality reduction algorithms (PCA and genetic algorithm) with classification algorithms (logistic regression, decision tree, and k-nearest neighbors), achieving a highest accuracy of 71.14% with a pipeline combining PCA, genetic algorithm, and k-nearest neighbors.
Machine learning (ML) is a branch of Artificial Intelligence (AI) where computers analyze data and find patterns in the data. The study focuses on the detection of metastatic cancer using ML. Metastatic cancer is the point where the cancer has spread to other parts of the body and is the cause of approximately 90% of cancer related deaths. Normally, pathologists spend hours each day to manually classify whether tumors are benign or malignant. This tedious task contributes to mislabeling metastasis being over 60% of time and emphasizes the importance to be aware of human error, and other inefficiencies. ML is a good candidate to improve the correct identification of metastatic cancer saving thousands of lives and can also improve the speed and efficiency of the process thereby taking less resources and time. So far, deep learning methodology of AI has been used in the research to detect cancer. This study is a novel approach to determine the potential of using preprocessing algorithms combined with classification algorithms in detecting metastatic cancer. The study used two preprocessing algorithms: principal component analysis (PCA) and the genetic algorithm to reduce the dimensionality of the dataset, and then used three classification algorithms: logistic regression, decision tree classifier, and k-nearest neighbors to detect metastatic cancer in the pathology scans. The highest accuracy of 71.14% was produced by the ML pipeline comprising of PCA, the genetic algorithm, and the k-nearest neighbors algorithm, suggesting that preprocessing and classification algorithms have great potential for detecting metastatic cancer.