Tokensome: Towards a Genetic Vision-Language GPT for Explainable and Cognitive Karyotyping
This work addresses the need for interpretable and effective karyotyping methods in clinical settings. It is an incremental advance that integrates existing technologies such as knowledge graphs and LLMs.
The paper tackles the problem of automatic karyotype analysis by moving beyond visual perception to incorporate cognitive decision-making, resulting in enhanced explainability and improved abnormality detection.
Automatic karyotype analysis is often defined as a visual perception task focused solely on chromosomal object-level modeling. This definition has led most existing methods to overlook componential and holistic information, significantly constraining model performance. Moreover, the lack of interpretability in current technologies hinders clinical adoption. In this paper, we introduce Tokensome, a novel vision-language model based on chromosome tokenization for explainable and cognitive karyotyping. Tokensome elevates the method from the conventional visual perception layer to the cognitive decision-making layer. This elevation enables the integration of domain knowledge and cognitive reasoning via knowledge graphs and LLMs, markedly enhancing the model's explainability and facilitating abnormality detection.