COMLMar 2, 2020

Tropical Support Vector Machine and its Applications to Phylogenomics

arXiv:2003.00677v213 citations
AI Analysis

This work addresses the problem of analyzing high-dimensional phylogenetic data for researchers in phylogenomics, representing an incremental advancement by adapting classical SVMs to a tropical geometry framework.

The authors tackled the challenge of classifying multi-locus phylogenetic data by proposing tropical support vector machines (SVMs), which operate in the non-Euclidean space of phylogenetic trees using max-plus algebra, and demonstrated through computational experiments that their methods work well.

Most data in genome-wide phylogenetic analysis (phylogenomics) is essentially multidimensional, posing a major challenge to human comprehension and computational analysis. Also, we can not directly apply statistical learning models in data science to a set of phylogenetic trees since the space of phylogenetic trees is not Euclidean. In fact, the space of phylogenetic trees is a tropical Grassmannian in terms of max-plus algebra. Therefore, to classify multi-locus data sets for phylogenetic analysis, we propose tropical support vector machines (SVMs). Like classical SVMs, a tropical SVM is a discriminative classifier defined by the tropical hyperplane which maximizes the minimum tropical distance from data points to itself in order to separate these data points into sectors (half-spaces) in the tropical projective torus. Both hard margin tropical SVMs and soft margin tropical SVMs can be formulated as linear programming problems. We focus on classifying two categories of data, and we study a simpler case by assuming the data points from the same category ideally stay in the same sector of a tropical separating hyperplane. For hard margin tropical SVMs, we prove the necessary and sufficient conditions for two categories of data points to be separated, and we show an explicit formula for the optimal value of the feasible linear programming problem. For soft margin tropical SVMs, we develop novel methods to compute an optimal tropical separating hyperplane. Computational experiments show our methods work well. We end this paper with open problems.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes