NE · GN · Apr 20

Motif Diversity in Human Liver ChIP-seq Data Using MAP-Elites

arXiv:2601.17808 · 7.2 · h-index: 8
Predicted impact top 84% in NE · last 90 days
Originality: Incremental advance
AI Analysis

For computational biologists, this work provides a method to uncover multiple plausible motif explanations from regulatory sequence data, addressing biological heterogeneity that single-solution approaches miss.

The authors reframe motif discovery as a quality-diversity problem using MAP-Elites, recovering multiple high-quality motif variants from human CTCF liver ChIP-seq data with fitness comparable to MEME's best solutions, while revealing structured diversity missed by single-solution methods.

Motif discovery is a core problem in computational biology, traditionally formulated as a likelihood optimization task that returns a single dominant motif from a DNA sequence dataset. However, regulatory sequence data admit multiple plausible motif explanations, reflecting underlying biological heterogeneity. In this work, we frame motif discovery as a quality-diversity problem and apply the MAP-Elites algorithm to evolve position weight matrix motifs under a likelihood-based fitness objective while explicitly preserving diversity across biologically meaningful dimensions. We evaluate MAP-Elites using three complementary behavioral characterizations that capture trade-offs between motif specificity, compositional structure, coverage, and robustness. Experiments on human CTCF liver ChIP-seq data aligned to the human reference genome compare MAP-Elites against a standard motif discovery tool, MEME, under matched evaluation criteria across stratified dataset subsets. Results show that MAP-Elites recovers multiple high-quality motif variants with fitness comparable to MEME's strongest solutions while revealing structured diversity obscured by single-solution approaches.
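
The quality-diversity framing described above can be illustrated with a minimal sketch. This is not the authors' implementation: the fitness (mean best-window log-odds against a uniform background), the two behaviour axes (mean information content and GC fraction), the grid size, and every function name here are assumptions chosen for illustration. The abstract mentions three behavioral characterizations but does not define them.

```python
import math
import random

ALPHABET = "ACGT"

def random_pwm(width, rng):
    """Draw a random position weight matrix: one probability column per position."""
    pwm = []
    for _ in range(width):
        weights = [rng.random() + 0.05 for _ in ALPHABET]
        total = sum(weights)
        pwm.append([w / total for w in weights])
    return pwm

def mutate(pwm, rng, scale=0.3):
    """Perturb one entry of one column, clamp, and renormalise the column."""
    child = [col[:] for col in pwm]
    j = rng.randrange(len(child))
    col = child[j]
    i = rng.randrange(4)
    col[i] += rng.uniform(-scale, scale)
    col = [max(0.01, p) for p in col]  # keep every probability strictly positive
    total = sum(col)
    child[j] = [p / total for p in col]
    return child

def fitness(pwm, seqs):
    """Assumed objective: mean over sequences of the best-window log-odds score."""
    width = len(pwm)
    total = 0.0
    for seq in seqs:
        best = -math.inf
        for start in range(len(seq) - width + 1):
            score = sum(
                math.log2(pwm[j][ALPHABET.index(seq[start + j])] / 0.25)
                for j in range(width)
            )
            best = max(best, score)
        total += best
    return total / len(seqs)

def descriptor(pwm):
    """Hypothetical behaviour axes, both scaled to roughly [0, 1]."""
    ic = sum(2 + sum(p * math.log2(p) for p in col) for col in pwm) / len(pwm)
    gc = sum(col[1] + col[2] for col in pwm) / len(pwm)
    return ic / 2.0, gc

def map_elites(seqs, width=6, bins=5, iterations=2000, n_init=100, seed=0):
    """Keep the fittest PWM found so far in each cell of a bins x bins grid."""
    rng = random.Random(seed)
    archive = {}  # cell -> (fitness, pwm)
    for it in range(iterations):
        if it < n_init or not archive:
            candidate = random_pwm(width, rng)
        else:
            _, parent = archive[rng.choice(sorted(archive))]
            candidate = mutate(parent, rng)
        fit = fitness(candidate, seqs)
        cell = tuple(
            min(bins - 1, max(0, int(d * bins))) for d in descriptor(candidate)
        )
        if cell not in archive or fit > archive[cell][0]:
            archive[cell] = (fit, candidate)
    return archive

if __name__ == "__main__":
    toy = ["ACGTCCGCGAGGTACGTTAC", "TTCCGCGAGGAACGTACGTA", "GGATCCGCGAGGTTTACAGT"]
    for cell, (fit, _) in sorted(map_elites(toy, iterations=500).items()):
        print(cell, round(fit, 3))
```

The returned archive holds one elite per occupied cell, so a single run yields a set of motif variants spread across the descriptor grid rather than one dominant motif. That is the contrast with single-solution tools that the abstract draws.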
