IV CV LG
May 16, 2023

CB-HVTNet: A channel-boosted hybrid vision transformer network for lymphocyte assessment in histopathological images

Momina Liaqat Ali, Zunaira Rauf, Asifullah Khan, Anabia Sohail, Rafi Ullah, Jeonghwan Gwak

arXiv:2305.09211v3
7.36 citations

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific problem for pathologists by providing a more accurate tool for medical image analysis, though it is incremental as it builds on existing hybrid vision transformer methods.

The paper tackles lymphocyte assessment in histopathological images by proposing a channel-boosted hybrid vision transformer network (CB-HVTNet) that combines transformers and CNNs to improve feature representation. It reports state-of-the-art performance on two public datasets, with good generalization ability.

Transformers, due to their ability to learn long-range dependencies, have overcome the shortcomings of convolutional neural networks (CNNs) for global perspective learning. Therefore, they have gained the focus of researchers for several vision-related tasks, including medical diagnosis. However, their multi-head attention module only captures global-level feature representations, which is insufficient for medical images. To address this issue, we propose a Channel Boosted Hybrid Vision Transformer (CB-HVT) that uses transfer learning to generate boosted channels and employs both transformers and CNNs to analyse lymphocytes in histopathological images. The proposed CB-HVT comprises five modules that work together to identify lymphocytes: a channel generation module, a channel exploitation module, a channel merging module, a region-aware module, and a detection and segmentation head. The channel generation module uses the idea of channel boosting through transfer learning to extract diverse channels from different auxiliary learners. In the channel exploitation module, these boosted channels are first concatenated and then ranked using an attention mechanism. In the channel merging module, a fusion block then merges the diverse boosted channels gradually and systematically to improve the network's learned representations. The CB-HVT also employs a proposal network in its region-aware module and a head to identify objects, even in overlapping regions and in the presence of artifacts. We evaluated the proposed CB-HVT on two publicly available datasets for lymphocyte assessment in histopathological images. The results show that CB-HVT outperformed other state-of-the-art detection models and has good generalization ability, demonstrating its value as a tool for pathologists.
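To make the channel-boosting idea more concrete, here is a minimal NumPy sketch of the concatenate, rank, and merge steps described in the abstract. It is not the authors' implementation. The squeeze-and-excitation style weighting, the single 1x1 fusion step (the abstract describes a more gradual fusion block), the random weights, and all names (`channel_attention`, `fuse_1x1`, `boost_and_merge`) and shapes are assumptions chosen for illustration.

```python
import numpy as np


def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel weighting.

    Assumed stand-in for the paper's attention mechanism.
    x has shape (C, H, W); returns the reweighted maps and per-channel scores.
    """
    squeezed = x.mean(axis=(1, 2))                   # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)          # bottleneck + ReLU
    scores = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate, one score per channel
    return x * scores[:, None, None], scores


def fuse_1x1(x, w):
    """1x1 convolution written as a per-pixel linear map: (C, H, W) -> (K, H, W)."""
    return np.einsum("kc,chw->khw", w, x)


def boost_and_merge(aux_maps, w1, w2, w_fuse):
    """Concatenate auxiliary-learner channels, weight them, then merge them."""
    boosted = np.concatenate(aux_maps, axis=0)       # stack channels from all learners
    weighted, scores = channel_attention(boosted, w1, w2)
    ranking = np.argsort(-scores)                    # channel indices, highest score first
    return fuse_1x1(weighted, w_fuse), ranking


# Hypothetical feature maps from two auxiliary learners (8 and 4 channels).
rng = np.random.default_rng(0)
aux_a = rng.standard_normal((8, 16, 16))
aux_b = rng.standard_normal((4, 16, 16))

n_channels = 12                                      # 8 + 4 boosted channels
w1 = rng.standard_normal((3, n_channels))            # untrained placeholder weights
w2 = rng.standard_normal((n_channels, 3))
w_fuse = rng.standard_normal((6, n_channels))

merged, ranking = boost_and_merge([aux_a, aux_b], w1, w2, w_fuse)
print(merged.shape)  # (6, 16, 16)
```

In a trained network the weights would be learned and the auxiliary maps would come from pretrained backbones. The sketch only shows how the tensor shapes flow from concatenation through ranking to the merged representation.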
