Extending compositional data analysis from a graph signal processing perspective
This work addresses a domain-specific problem for researchers in fields like bioinformatics and geochemistry by providing an incremental improvement over traditional compositional data analysis methods.
The paper tackles the problem of analyzing compositional data where relationships exist only between specific variables, not all pairs, by linking compositional data analysis with graph signal processing and extending Aitchison geometry to consider only selected log-ratios. The result is a framework that retains scale invariance and compositional coherence, with examples from bioinformatics and geochemistry showing its usefulness compared to standard methods.
Traditional methods for the analysis of compositional data consider the log-ratios between all pairs of variables with equal weight, typically in the form of aggregated contributions. This is not meaningful in contexts where a relationship is known to exist only between very specific variables (e.g. for metabolomic pathways) and not between other pairs. The presence or absence of relationships is modeled in graph theory, where the vertices represent the variables and the edges represent the relations. This paper links compositional data analysis with graph signal processing, and it extends the Aitchison geometry to a setting where only selected log-ratios are considered. The presented framework retains the desirable properties of scale invariance and compositional coherence. An additional extension to include absolute information is readily made. Examples from bioinformatics and geochemistry underline the usefulness of this approach in comparison to standard methods for compositional data analysis.
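The idea of restricting attention to selected log-ratios can be illustrated with a small numerical sketch. It is not the paper's construction, whose exact definitions are not given here. It only assumes that the selected pairs are encoded as weighted edges of a graph, and that the aggregated squared log-ratios are written as the Laplacian quadratic form of the log-transformed composition. The function names, the example composition, and the path graph are all hypothetical. With a complete graph and weight 1/D on every pair, the quantity reduces to the squared Aitchison norm; with a sparse graph, only the log-ratios along the edges contribute.

```python
import numpy as np

def laplacian(num_parts, edges):
    """Weighted graph Laplacian L = Deg - W from (i, j, weight) edges."""
    W = np.zeros((num_parts, num_parts))
    for i, j, w in edges:
        W[i, j] = W[j, i] = w
    return np.diag(W.sum(axis=1)) - W

def graph_logratio_energy(x, L):
    """Sum over edges of w_ij * log(x_i / x_j)**2, computed as y^T L y with y = log x."""
    y = np.log(np.asarray(x, dtype=float))
    return float(y @ L @ y)

def aitchison_sq_norm(x):
    """Squared Aitchison norm via the centred log-ratio (clr) transform."""
    y = np.log(np.asarray(x, dtype=float))
    clr = y - y.mean()
    return float(clr @ clr)

# Hypothetical four-part composition (illustrative values only).
x = np.array([0.1, 0.2, 0.3, 0.4])
D = len(x)

# Complete graph: every pair of parts is connected with weight 1/D.
complete_edges = [(i, j, 1.0 / D) for i in range(D) for j in range(i + 1, D)]
L_full = laplacian(D, complete_edges)

# Assumed sparse graph: only neighbouring parts 0-1, 1-2, 2-3 are related.
path_edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
L_path = laplacian(D, path_edges)

full_energy = graph_logratio_energy(x, L_full)
path_energy = graph_logratio_energy(x, L_path)

# Rescaling the composition leaves the energy unchanged, because every
# Laplacian row sums to zero and the constant shift log(10) is annihilated.
rescaled_path_energy = graph_logratio_energy(10.0 * x, L_path)

print(np.isclose(full_energy, aitchison_sq_norm(x)))
print(np.isclose(path_energy, rescaled_path_energy))
```

Both checks should hold up to floating-point tolerance: the first reflects the standard identity between the clr norm and the aggregated pairwise log-ratios, and the second shows that scale invariance survives when only a subset of log-ratios is retained.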