CELG · Dec 13, 2016

Identification of Cancer Patient Subgroups via Smoothed Shortest Path Graph Kernel

arXiv:1612.04431v2 · 1 citation
AI Analysis

This work addresses the challenge of identifying cancer patient subgroups for improved diagnosis and treatment, though it is incremental as it builds on existing graph kernel and clustering methods.

The authors tackled the problem of refining cancer subtypes by clustering patients based on mutational profiles using a novel graph kernel, achieving up to 88% accuracy on simulated data and identifying ovarian cancer subgroups with significantly different survival times (p-value ≤ 0.005).

Characterizing patients' somatic mutations through next-generation sequencing technologies opens up possibilities for refining cancer subtypes. However, catalogues of mutations reveal that only a small fraction of genes are altered frequently in patients. On the other hand, different genomic alterations may perturb the same pathways. We propose a novel clustering procedure that quantifies the similarities of patients from their mutational profiles on pathways via a novel graph kernel. We represent each KEGG pathway as an undirected graph. For each patient, the vertex labels are assigned based on the patient's altered genes. The smoothed shortest path graph kernel (smSPK) evaluates each pair of patients by comparing their vertex-labeled pathway graphs. Our clustering procedure involves two steps. In the first step, the smSPK kernel matrix derived for each pathway is input to the kernel k-means algorithm, and each pathway is evaluated individually. In the next step, only those pathways that are successful are combined into a single kernel, which is input to kernel k-means to stratify patients. Evaluating the procedure on simulated data showed that smSPK clusters patients with up to 88% accuracy. Finally, to identify ovarian cancer patient subgroups, we apply our methodology to The Cancer Genome Atlas ovarian data, which involves 481 patients. The identified subgroups are evaluated through survival analysis. Grouping patients into four clusters results in patient groups that differ significantly in their survival times (p-value ≤ 0.005).
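
The two-step procedure in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the smSPK computation, the criterion for a "successful" pathway, or how kernels are combined, so the sketch makes three assumptions: a linear kernel on toy binary mutation vectors stands in for smSPK, both toy pathways are treated as successful, and kernels are combined by simple averaging. The names `kernel_kmeans`, `combine_kernels`, `X1`, and `X2` are hypothetical, and the round-robin initialisation is a simplification chosen to keep the run deterministic.

```python
import numpy as np

def kernel_kmeans(K, n_clusters, n_iter=100):
    """Cluster samples from a precomputed kernel (Gram) matrix K.

    Labels start from a deterministic round-robin assignment (a
    simplification; random restarts are common in practice).
    """
    n = K.shape[0]
    labels = np.arange(n) % n_clusters
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            size = members.sum()
            if size == 0:
                continue  # an empty cluster keeps infinite distance
            # Squared feature-space distance to the cluster mean:
            # K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl  (j, l in cluster c)
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, members].sum(axis=1) / size
                          + K[np.ix_(members, members)].sum() / size**2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels

def combine_kernels(kernels):
    """Average kernel matrices (assumed combination rule)."""
    return sum(kernels) / len(kernels)

# Toy data: 6 patients, 2 pathways, 2 genes per pathway (all hypothetical).
# Rows are patients; a 1 marks an altered gene.
X1 = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
X2 = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [0, 1]], dtype=float)

# Linear kernels as a stand-in for the per-pathway smSPK matrices.
K1 = X1 @ X1.T
K2 = X2 @ X2.T

# Step 1: cluster with each pathway kernel individually.
per_pathway = [kernel_kmeans(K, 2) for K in (K1, K2)]

# Step 2: combine the pathways assumed to be successful and cluster again.
labels = kernel_kmeans(combine_kernels([K1, K2]), 2)
print(labels)  # prints [0 0 0 1 1 1]
```

Averaging keeps the combined matrix positive semidefinite whenever the inputs are, so it remains a valid input to kernel k-means.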

