IV · CV · Nov 6, 2023

FATE: Feature-Agnostic Transformer-based Encoder for learning generalized embedding spaces in flow cytometry data

arXiv:2311.03314v1 · 6 citations · h-index: 9 · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses data scarcity and feature variability in medical diagnostics such as flow cytometry. The advance is incremental, since it builds on existing set-transformer methods.

The authors tackle learning from data with varying feature sets, as in flow cytometry, where the measured features differ between samples. They propose an architecture that learns a general embedding space without requiring aligned features, and they demonstrate its effectiveness for automatic cancer cell detection in acute myeloid leukemia.

While model architectures and training strategies have become more generic and flexible with respect to different data modalities over the past years, a persistent limitation lies in the assumption of fixed quantities and arrangements of input features. This limitation becomes particularly relevant in scenarios where the attributes captured during data acquisition vary across different samples. In this work, we aim at effectively leveraging data with varying features, without the need to constrain the input space to the intersection of potential feature sets or to expand it to their union. We propose a novel architecture that can directly process data without the necessity of aligned feature modalities by learning a general embedding space that captures the relationship between features across data samples with varying sets of features. This is achieved via a set-transformer architecture augmented by feature-encoder layers, thereby enabling the learning of a shared latent feature space from data originating from heterogeneous feature spaces. The advantages of the model are demonstrated for automatic cancer cell detection in acute myeloid leukemia in flow cytometry data, where the features measured during acquisition often vary between samples. Our proposed architecture's capacity to operate seamlessly across incongruent feature spaces is particularly relevant in this context, where data scarcity arises from the low prevalence of the disease. The code is available for research purposes at https://github.com/lisaweijler/FATE.
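The core idea in the abstract, mapping samples with different feature sets into one shared embedding space, can be illustrated with a small sketch. This is not the authors' FATE implementation (see the linked repository for that). It is a simplified stand-in that replaces the set-transformer and feature-encoder layers with a single attention-pooling step, and every name in it (`MARKERS`, `marker_emb`, `w_value`, `query`, `encode_event`), the marker list, and the embedding width are assumptions made for illustration. The weights are random rather than learned.

```python
import math
import random

random.seed(0)
D = 8  # embedding width, chosen arbitrarily for illustration

# Hypothetical marker vocabulary; in a trained model these vectors would be learned.
MARKERS = ["CD34", "CD45", "CD117", "CD33", "SSC-A"]
marker_emb = {m: [random.gauss(0, 1) for _ in range(D)] for m in MARKERS}
w_value = [random.gauss(0, 1) for _ in range(D)]  # lifts a scalar value to D dims
query = [random.gauss(0, 1) for _ in range(D)]    # pooling query vector


def encode_event(measurements):
    """Map one cell's {marker: value} dict to a fixed-size list of D floats."""
    # One token per measured feature: marker identity plus scaled value.
    tokens = [
        [e + v * w for e, w in zip(marker_emb[m], w_value)]
        for m, v in measurements.items()
    ]
    # Scaled dot-product scores between the pooling query and each token.
    scores = [
        sum(t_i * q_i for t_i, q_i in zip(t, query)) / math.sqrt(D)
        for t in tokens
    ]
    # Numerically stable softmax over the feature tokens.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    # Weighted sum of tokens: same output size for any number of features.
    return [sum(x / total * t[d] for x, t in zip(exps, tokens)) for d in range(D)]


# Two cells measured with different, only partly overlapping panels.
a = encode_event({"CD34": 0.7, "CD45": 0.2, "CD117": 0.9})
b = encode_event({"CD45": 0.4, "SSC-A": 0.1})
print(len(a), len(b))
```

Both cells end up as vectors of the same length even though one has three measured features and the other has two, and the result does not depend on the order in which features are listed. That is the property that lets a downstream model train on samples without restricting the input to the intersection of feature sets or padding it to their union.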

Code Implementations: 1 repo
