GN CE LG ML · Dec 2, 2014

Learning interpretable models of phenotypes from whole genome sequences with the Set Covering Machine

Alexandre Drouin, Sébastien Giguère, Vladana Sagatovich, Maxime Déraspe, François Laviolette, Mario Marchand, Jacques Corbeil

arXiv:1412.1074v1 · 9 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for interpretable models in genomics, particularly for predicting antibiotic resistance in a human pathogen, though it appears incremental as it applies an existing method to a new domain.

The authors address the problem of learning interpretable models of discrete phenotypes from whole genome sequences, using the Set Covering Machine with a k-mer representation of the genomes. They show that extremely sparse, biologically relevant models can be learned, as demonstrated by predicting the resistance of Pseudomonas aeruginosa to 4 antibiotics.

The increased affordability of whole genome sequencing has motivated its use for phenotypic studies. We address the problem of learning interpretable models for discrete phenotypes from whole genomes. We propose a general approach that relies on the Set Covering Machine and a k-mer representation of the genomes. We show results for the problem of predicting the resistance of Pseudomonas aeruginosa, an important human pathogen, against 4 antibiotics. Our results demonstrate that extremely sparse models which are biologically relevant can be learnt using this approach.
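
To make the combination of a k-mer representation and a Set Covering Machine more concrete, here is a minimal sketch of a greedy conjunction learner over k-mer presence/absence rules. It is not the authors' implementation: the function names (`kmers`, `train_scm_conjunction`, `predict`), the penalty parameter `p`, the `max_rules` cap, the label encoding (1 = resistant, 0 = sensitive), and the toy sequences are all assumptions made for illustration, and the selection rule is a simplified version of the usual SCM utility (negatives rejected minus `p` times positives wrongly rejected).

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train_scm_conjunction(genomes, labels, k, p=1.0, max_rules=3):
    """Greedily build a conjunction of k-mer presence/absence rules.

    A rule (kmer, present) holds on a genome when
    (kmer in genome) == present. The conjunction predicts 1 only if
    every selected rule holds. Simplified sketch, hypothetical API.
    """
    profiles = [kmers(g, k) for g in genomes]
    vocabulary = sorted(set().union(*profiles))
    candidates = [(km, present) for km in vocabulary
                  for present in (True, False)]

    negatives = {i for i, y in enumerate(labels) if y == 0}
    positives = {i for i, y in enumerate(labels) if y == 1}
    model = []

    while negatives and len(model) < max_rules:
        best = None
        best_utility = 0.0
        for km, present in candidates:
            # Examples on which this rule fails are rejected (predicted 0).
            covered = {i for i in negatives
                       if (km in profiles[i]) != present}
            errors = {i for i in positives
                      if (km in profiles[i]) != present}
            utility = len(covered) - p * len(errors)
            if utility > best_utility:
                best = ((km, present), covered, errors)
                best_utility = utility
        if best is None:
            break  # no rule has positive utility
        rule, covered, errors = best
        model.append(rule)
        negatives -= covered  # these negatives are now handled
        positives -= errors   # these positives are already lost
    return model


def predict(model, genome, k):
    """Return 1 if every rule of the conjunction holds, else 0."""
    profile = kmers(genome, k)
    return int(all((km in profile) == present for km, present in model))


# Toy data, invented for illustration: "resistant" genomes share GAT.
genomes = ["ACGTGAT", "TTGATCA", "ACGTACG", "TTCACCA"]
labels = [1, 1, 0, 0]
model = train_scm_conjunction(genomes, labels, k=3)
print(model)
print(predict(model, "CCGATTT", 3), predict(model, "CCCCCC", 3))
```

On this toy input a single rule is enough to separate the two classes, which illustrates the kind of sparsity the abstract refers to: the learned model is a short list of k-mers that can be read and checked directly.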
