Predicting pathways for old and new metabolites through clustering
This work addresses a bottleneck in metabolomics, namely assigning pathways to newly discovered metabolites, although the contribution is incremental because it builds on existing clustering methods.
The authors address the slow process of elucidating pathways for new metabolites, which, according to HMDB, leaves only 60% of metabolites assigned to a pathway. They cluster metabolites on structural features extracted from SMILES annotations, and the resulting clusters link 92% of known metabolites to their respective pathways.
Diverse metabolic pathways are fundamental to all living organisms: they harvest energy, synthesize biomass components, produce molecules that interact with the microenvironment, and neutralize toxins. While the discovery of new metabolites and pathways continues, predicting pathways for new metabolites can be challenging. Elucidating pathways for new metabolites can take vast amounts of time; as a result, according to HMDB, only 60% of metabolites are assigned to pathways. Here, we present an approach to identify pathways based on metabolite structure. We extracted 201 features from SMILES annotations and identified new metabolites from PubMed abstracts and HMDB. After applying clustering algorithms to both groups of features, we quantified correlations between metabolites and found that the clusters accurately linked 92% of known metabolites to their respective pathways. Thus, this approach could be valuable for predicting metabolic pathways for new metabolites.
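The general idea of clustering metabolites on SMILES-derived features and checking the clusters against known pathways can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not list the 201 features or name the clustering algorithms, so the five character-count features, the small k-means routine, the function names (`smiles_features`, `kmeans`, `cluster_purity`), and the two pathway labels are all hypothetical stand-ins chosen for a toy data set.

```python
import math
from collections import Counter

def smiles_features(smiles):
    """Toy structural features counted directly from a SMILES string.
    Assumption: these 5 counts stand in for the paper's 201 features."""
    return [
        smiles.count("C") + smiles.count("c"),    # carbon atoms (crude: also counts the C in Cl)
        smiles.count("N") + smiles.count("n"),    # nitrogen atoms
        smiles.count("O") + smiles.count("o"),    # oxygen atoms
        smiles.count("="),                        # double bonds
        sum(ch.isdigit() for ch in smiles) // 2,  # ring closures (digits come in pairs)
    ]

def kmeans(points, k, n_iter=20):
    """Plain k-means with farthest-first initialisation (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_purity(labels, pathways):
    """Fraction of metabolites whose pathway matches their cluster's majority pathway."""
    correct = 0
    for cluster in set(labels):
        members = [pw for lab, pw in zip(labels, pathways) if lab == cluster]
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(labels)

# Illustrative toy data: pathway labels are simplified, not taken from HMDB.
metabolites = {
    "glycine": ("NCC(=O)O", "amino acid metabolism"),
    "alanine": ("CC(N)C(=O)O", "amino acid metabolism"),
    "serine": ("NC(CO)C(=O)O", "amino acid metabolism"),
    "hexanoic acid": ("CCCCCC(=O)O", "fatty acid metabolism"),
    "octanoic acid": ("CCCCCCCC(=O)O", "fatty acid metabolism"),
    "decanoic acid": ("CCCCCCCCCC(=O)O", "fatty acid metabolism"),
}

features = [smiles_features(smi) for smi, _ in metabolites.values()]
pathways = [pw for _, pw in metabolites.values()]
labels = kmeans(features, k=2)
purity = cluster_purity(labels, pathways)

for name, lab in zip(metabolites, labels):
    print(f"{name}: cluster {lab}")
print(f"cluster purity: {purity:.2f}")
```

Under this scheme, a new metabolite would be assigned the majority pathway of the cluster its feature vector falls into. Counting characters is a crude substitute for real cheminformatics descriptors, and the purity score here is only one possible way to quantify how well clusters agree with known pathways.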