LG PR QM ML

Oct 24, 2025

Boltzmann Graph Ensemble Embeddings for Aptamer Libraries

Starlika Bauskar, Jade Jiao, Narayanan Kannan, Alexander Kimm, Justin M. Baker, Matthew J. Tyler, Andrea L. Bertozzi, Anne M. Andrews

arXiv:2510.21980v1

h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses the challenge of identifying aptamers with high ligand affinity in biochemistry, particularly for low-abundance candidates, though it appears incremental as it builds on existing graph-based methods.

The paper addresses the problem of predicting aptamer-ligand affinity from SELEX datasets, where experimental biases obscure true binding strengths, by introducing a Boltzmann-weighted ensemble embedding for molecules. The authors report that this embedding enables robust community detection and subgraph-level explanations for affinity, even with biased observations.

Machine-learning methods in biochemistry commonly represent molecules as graphs of pairwise intermolecular interactions for property and structure predictions. Most methods operate on a single graph, typically the minimum free energy (MFE) structure, for low-energy ensembles (conformations) representative of structures at thermodynamic equilibrium. We introduce a thermodynamically parameterized exponential-family random graph model (ERGM) embedding that models molecules as Boltzmann-weighted ensembles of interaction graphs. We evaluate this embedding on SELEX datasets, where experimental biases (e.g., PCR amplification or sequencing noise) can obscure true aptamer-ligand affinity, producing anomalous candidates whose observed abundance diverges from their actual binding strength. We show that the proposed embedding enables robust community detection and subgraph-level explanations for aptamer-ligand affinity, even in the presence of biased observations. This approach may be used to identify low-abundance aptamer candidates for further experimental evaluation.
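
To make the idea of a Boltzmann-weighted ensemble of interaction graphs concrete, the following is a minimal sketch and not the authors' implementation. It assumes each candidate structure is given as a set of index pairs with a free energy in kcal/mol, and it averages edge indicators under the Boltzmann distribution. The function names, the toy structures, the energies, and the 310.15 K temperature are all illustrative assumptions; the paper's ERGM parameterization is not reproduced here.

```python
import math

# Gas constant in kcal/(mol*K). The 310.15 K default (37 degrees C) is an assumption.
R_KCAL = 0.0019872


def boltzmann_weights(energies, temperature=310.15):
    """Return normalized Boltzmann weights for free energies given in kcal/mol."""
    beta = 1.0 / (R_KCAL * temperature)
    e_min = min(energies)  # shift by the minimum energy for numerical stability
    unnorm = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(unnorm)  # partition function over the supplied structures only
    return [u / z for u in unnorm]


def expected_edge_weights(structures, energies, temperature=310.15):
    """Average each edge's presence over the ensemble, weighted by probability."""
    weights = boltzmann_weights(energies, temperature)
    expected = {}
    for edges, w in zip(structures, weights):
        for i, j in edges:
            key = (min(i, j), max(i, j))  # treat edges as undirected
            expected[key] = expected.get(key, 0.0) + w
    return expected


# Hypothetical toy ensemble: three candidate interaction graphs for one molecule.
structures = [
    {(0, 7), (1, 6), (2, 5)},
    {(0, 7), (1, 6)},
    {(1, 7), (2, 6)},
]
energies = [-3.0, -2.2, -1.5]  # made-up free energies in kcal/mol

weights = boltzmann_weights(energies)
edge_probs = expected_edge_weights(structures, energies)
```

In this sketch the lowest-energy graph receives the largest weight, but edges that appear only in higher-energy graphs still contribute, which is the information a single-MFE representation discards.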
