IV | AI | CL | CV | Jul 20, 2024

Large-vocabulary forensic pathological analyses via prototypical cross-modal contrastive learning

arXiv:2407.14904v1 | 5 citations | h-index: 9
Originality: Incremental advance
AI Analysis

This work targets the laborious and variable nature of forensic pathology analysis, offering professionals a new AI tool with detailed explainability. The contribution is domain-specific and methodologically incremental.

The paper tackles variability and inefficiency in forensic pathology by developing SongCi, a visual-language model trained with prototypical cross-modal contrastive learning on over 16 million image patches and 2,228 vision-language pairs. SongCi performs comparably to experienced forensic pathologists and surpasses existing multi-modal AI models on many forensic pathology tasks.

Forensic pathology is critical in determining the cause and manner of death through post-mortem examinations, both macroscopic and microscopic. The field, however, grapples with issues such as outcome variability, laborious processes, and a scarcity of trained professionals. This paper presents SongCi, an innovative visual-language model (VLM) designed specifically for forensic pathology. SongCi utilizes advanced prototypical cross-modal self-supervised contrastive learning to enhance the accuracy, efficiency, and generalizability of forensic analyses. It was pre-trained and evaluated on a comprehensive multi-center dataset, which includes over 16 million high-resolution image patches, 2,228 vision-language pairs of post-mortem whole slide images (WSIs), and corresponding gross key findings, along with 471 distinct diagnostic outcomes. Our findings indicate that SongCi surpasses existing multi-modal AI models in many forensic pathology tasks, performs comparably to experienced forensic pathologists and significantly better than less experienced ones, and provides detailed multi-modal explainability, offering critical assistance in forensic investigations. To the best of our knowledge, SongCi is the first VLM specifically developed for forensic pathological analysis and the first large-vocabulary computational pathology (CPath) model that directly processes gigapixel WSIs in forensic science.
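To make the "cross-modal contrastive learning" idea in the abstract more concrete, here is a minimal NumPy sketch of a generic symmetric (CLIP-style) contrastive loss between paired image and text embeddings. It is not SongCi's actual objective: the abstract does not specify the loss, the prototype step is omitted entirely, and the function names, the `temperature` value, and the embedding shapes are all assumptions chosen for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Generic symmetric InfoNCE loss (illustrative, not the paper's loss).

    image_emb, text_emb: arrays of shape (batch, dim); row i of each array
    is assumed to come from the same case (for example, a slide-level
    embedding and the embedding of its gross key findings).
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    # Pairwise cosine similarities, sharpened by the temperature.
    logits = img @ txt.T / temperature
    idx = np.arange(logits.shape[0])
    # Image-to-text: each image should pick out its own text (row-wise softmax).
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-image: each text should pick out its own image (column-wise softmax).
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = rng.normal(size=(8, 16))
    texts = images + 0.01 * rng.normal(size=(8, 16))  # nearly aligned pairs
    print("aligned pairs:", symmetric_contrastive_loss(images, texts))
    print("shuffled pairs:", symmetric_contrastive_loss(images, np.roll(texts, 1, axis=0)))
```

Correctly paired embeddings should yield a much lower loss than shuffled ones, which is the signal that pulls matching image and text representations together during pre-training.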

Code Implementations: 1 repo
