IV CVDec 1, 2021

Multi-task fusion for improving mammography screening data classification

Maria Wimmer, Gert Sluiter, David Major, Dimitrios Lenis, Astrid Berg, Theresa Neubauer, Katja Bühler

arXiv:2112.01320v118 citations

Originality Incremental advance

AI Analysis

This work addresses the need for more accurate and comprehensive patient-level predictions in mammography screening to support radiologists, though it is incremental as it builds on existing model ensembling strategies.

The paper tackled the problem of improving mammography screening classification by proposing a multi-task fusion pipeline that combines predictions and features from task-specific models, achieving AUC scores of 0.962 for any lesion and 0.791 for malignant lesions on patient-level data.

Machine learning and deep learning methods have become essential for computer-assisted prediction in medicine, with a growing number of applications also in the field of mammography. Typically these algorithms are trained for a specific task, e.g., the classification of lesions or the prediction of a mammogram's pathology status. To obtain a comprehensive view of a patient, models which were all trained for the same task(s) are subsequently ensembled or combined. In this work, we propose a pipeline approach, where we first train a set of individual, task-specific models and subsequently investigate the fusion thereof, which is in contrast to the standard model ensembling strategy. We fuse model predictions and high-level features from deep learning models with hybrid patient models to build stronger predictors on patient level. To this end, we propose a multi-branch deep learning model which efficiently fuses features across different tasks and mammograms to obtain a comprehensive patient-level prediction. We train and evaluate our full pipeline on public mammography data, i.e., DDSM and its curated version CBIS-DDSM, and report an AUC score of 0.962 for predicting the presence of any lesion and 0.791 for predicting the presence of malignant lesions on patient level. Overall, our fusion approaches improve AUC scores significantly by up to 0.04 compared to standard model ensembling. Moreover, by providing not only global patient-level predictions but also task-specific model results that are related to radiological features, our pipeline aims to closely support the reading workflow of radiologists.

View on arXiv PDF

Similar