CV | Nov 25, 2024

Cluster-based human-in-the-loop strategy for improving machine learning-based circulating tumor cell detection in liquid biopsy

arXiv:2411.16332v1 | 5 citations | h-index: 69 | Patterns
Originality: Incremental advance
AI Analysis

This work addresses the problem of insufficient labeled data for automated CTC detection in cancer patients, offering an incremental improvement to existing methods.

The study improves machine learning-based detection of circulating tumor cells in liquid biopsy with a human-in-the-loop strategy that combines self-supervised deep learning with iterative targeted sampling guided by per-cluster classification performance. It demonstrates advantages over random sampling on data from patients with metastatic breast cancer.

Detection and differentiation of circulating tumor cells (CTCs) and non-CTCs in blood draws of cancer patients pose multiple challenges. While the gold standard relies on tedious manual evaluation of an automatically generated selection of images, machine learning (ML) techniques offer the potential to automate these processes. However, human assessment remains indispensable when the ML system arrives at uncertain or wrong decisions due to an insufficient set of labeled training data. This study introduces a human-in-the-loop (HiL) strategy for improving ML-based CTC detection. We combine self-supervised deep learning and a conventional ML-based classifier and propose iterative targeted sampling and labeling of new unlabeled training samples by human experts. The sampling strategy is based on the classification performance of local latent space clusters. The advantages of the proposed approach compared to naive random sampling are demonstrated for liquid biopsy data from patients with metastatic breast cancer.
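The targeted sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes cluster assignments in the latent space are already available, and the function names (`cluster_error_rates`, `targeted_sample`), the error-rate criterion, and the "worst clusters first" rule are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def cluster_error_rates(cluster_ids, y_true, y_pred):
    """Fraction of misclassified labeled samples in each latent-space cluster."""
    totals, errors = defaultdict(int), defaultdict(int)
    for c, t, p in zip(cluster_ids, y_true, y_pred):
        totals[c] += 1
        errors[c] += int(t != p)
    return {c: errors[c] / totals[c] for c in totals}


def targeted_sample(unlabeled_cluster_ids, error_rates, budget, seed=0):
    """Pick up to `budget` unlabeled indices, visiting the worst clusters first."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for idx, c in enumerate(unlabeled_cluster_ids):
        by_cluster[c].append(idx)
    # Assumed rule: clusters with the highest error rate are queried first.
    ranked = sorted(by_cluster, key=lambda c: error_rates.get(c, 0.0), reverse=True)
    chosen = []
    for c in ranked:
        pool = by_cluster[c]
        rng.shuffle(pool)  # random choice within a cluster
        chosen.extend(pool[: budget - len(chosen)])
        if len(chosen) >= budget:
            break
    return chosen


# Toy data (made up): three clusters, labels 1 = CTC, 0 = non-CTC.
labeled_clusters = [0, 0, 0, 1, 1, 1, 2, 2]
y_true = [1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
rates = cluster_error_rates(labeled_clusters, y_true, y_pred)

unlabeled_clusters = [0, 1, 1, 2, 0, 1, 2]
picked = targeted_sample(unlabeled_clusters, rates, budget=4)
```

In this toy run, cluster 1 has the highest error rate, so all of its unlabeled samples are sent to the expert before any sample from cluster 2 is drawn. Naive random sampling would instead spread the same labeling budget across all clusters regardless of where the classifier struggles.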

