RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representations from X-Ray Images
This work addresses lung disease detection from X-ray images and is particularly beneficial when training data is limited; however, the contribution is incremental, as it builds on existing masked image modeling techniques.
The paper tackles the problem of learning meaningful representations from X-ray images by proposing RGMIM, a region-guided masked image modeling method that uses organ masks to identify valid regions for masking. It reaches a lung disease detection accuracy of 0.962 on the full training set and shows significant improvements over comparable methods when training data is limited.
In this study, we propose a novel method called region-guided masked image modeling (RGMIM) for learning meaningful representations from X-ray images. Our method adopts a new masking strategy that uses organ mask information to identify valid regions, so that more meaningful representations can be learned. We conduct quantitative evaluations on an open lung X-ray image dataset and study the masking ratio hyperparameter. When using the entire training set, RGMIM outperforms other comparable methods, achieving a lung disease detection accuracy of 0.962. In particular, RGMIM significantly improves performance over other methods when only a small fraction of the training set, such as 5% or 10%, is used. RGMIM can mask more valid regions, which facilitates the learning of discriminative representations and the subsequent high-accuracy lung disease detection. In our experiments, RGMIM outperforms other state-of-the-art self-supervised learning methods, particularly when limited training data is used.
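The masking strategy described above can be illustrated with a minimal sketch. The abstract does not give the exact selection rule, so everything below is an assumption made for illustration: the function name region_guided_patch_mask, the parameters patch_size, mask_ratio and min_overlap, the choice to treat a patch as "valid" when it overlaps the organ mask, and the uniform random sampling among valid patches. It is not the authors' implementation.

```python
import numpy as np


def region_guided_patch_mask(organ_mask, patch_size=16, mask_ratio=0.5,
                             min_overlap=0.0, seed=None):
    """Return a boolean (rows, cols) grid where True marks a masked patch.

    Hypothetical sketch: patches are only masked if they overlap the organ
    (e.g. lung) region given by `organ_mask`, a 2D binary array.
    """
    h, w = organ_mask.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size

    # Fraction of organ pixels inside each patch.
    coverage = (organ_mask.astype(bool)
                .reshape(rows, patch_size, cols, patch_size)
                .mean(axis=(1, 3)))

    # Assumption: a patch is "valid" if its organ coverage exceeds min_overlap.
    valid = np.flatnonzero(coverage.ravel() > min_overlap)

    # Target count is a fraction of all patches, capped by the valid ones.
    n_mask = min(int(round(mask_ratio * rows * cols)), valid.size)

    rng = np.random.default_rng(seed)
    chosen = rng.choice(valid, size=n_mask, replace=False)

    patch_mask = np.zeros(rows * cols, dtype=bool)
    patch_mask[chosen] = True
    return patch_mask.reshape(rows, cols)


if __name__ == "__main__":
    # Toy 64x64 "lung mask": organ occupies the central 32x32 square.
    toy = np.zeros((64, 64), dtype=np.uint8)
    toy[16:48, 16:48] = 1
    print(region_guided_patch_mask(toy, mask_ratio=0.125, seed=0).astype(int))
```

With random masking, many masked patches would fall on background; restricting the candidates to organ-overlapping patches is one plausible way to "mask more valid regions", under the assumptions stated above.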