QM, CV, IV · Jul 21, 2025

A tissue and cell-level annotated H&E and PD-L1 histopathology image dataset in non-small cell lung cancer

arXiv:2507.16855v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This dataset fills a gap for computational pathology researchers by supporting biomarker development for immunotherapy response in NSCLC, though the contribution is incremental in that it builds on existing data collection efforts.

The authors address the lack of comprehensive digital pathology datasets for non-small cell lung cancer (NSCLC) by introducing the IGNITE data toolkit. It includes 887 annotated regions of interest from 155 patients across three tasks: tissue segmentation in H&E slides, nuclei detection, and PD-L1 positive tumor cell detection in PD-L1 IHC slides. According to the authors, it is the first public NSCLC dataset with manual annotations of H&E in metastatic sites and of PD-L1 IHC.

The tumor immune microenvironment (TIME) in non-small cell lung cancer (NSCLC) histopathology contains morphological and molecular characteristics predictive of immunotherapy response. Computational quantification of TIME characteristics, such as cell detection and tissue segmentation, can support biomarker development. However, currently available digital pathology datasets of NSCLC for the development of cell detection or tissue segmentation algorithms are limited in scope, lack annotations of clinically prevalent metastatic sites, and forgo molecular information such as PD-L1 immunohistochemistry (IHC). To fill this gap, we introduce the IGNITE data toolkit, a multi-stain, multi-centric, and multi-scanner dataset of annotated NSCLC whole-slide images. We publicly release 887 fully annotated regions of interest from 155 unique patients across three complementary tasks: (i) multi-class semantic segmentation of tissue compartments in H&E-stained slides, with 16 classes spanning primary and metastatic NSCLC, (ii) nuclei detection, and (iii) PD-L1 positive tumor cell detection in PD-L1 IHC slides. To the best of our knowledge, this is the first public NSCLC dataset with manual annotations of H&E in metastatic sites and PD-L1 IHC.
