LG · SP · MN · Nov 6, 2025

SPECTRA: Spectral Target-Aware Graph Augmentation for Imbalanced Molecular Property Regression

arXiv:2511.04838v1 · 1 citation · h-index: 5
Originality: Highly original
AI Analysis

This paper addresses the challenge of predicting properties for rare, high-value molecules in drug discovery. It is an incremental improvement: a novel method applied to a known bottleneck.

The paper tackles imbalanced molecular property regression, where standard Graph Neural Networks underperform on rare but valuable compounds. It introduces SPECTRA, a spectral target-aware graph augmentation framework that generates realistic molecular graphs to densify underrepresented regions of the target space. On benchmarks, this lowers error in the relevant target ranges while keeping overall MAE competitive.

In molecular property prediction, the most valuable compounds (e.g., high potency) often occupy sparse regions of the target space. Standard Graph Neural Networks (GNNs) commonly optimize for the average error and so underperform on these uncommon but critical cases, while existing oversampling methods often distort molecular topology. In this paper, we introduce SPECTRA, a Spectral Target-Aware graph augmentation framework that generates realistic molecular graphs in the spectral domain. SPECTRA (i) reconstructs multi-attribute molecular graphs from SMILES; (ii) aligns molecule pairs via (Fused) Gromov-Wasserstein couplings to obtain node correspondences; (iii) interpolates Laplacian eigenvalues, eigenvectors and node features in a stable shared basis; and (iv) reconstructs edges to synthesize physically plausible intermediates with interpolated targets. A rarity-aware budgeting scheme, derived from a kernel density estimation of labels, concentrates augmentation where data are scarce. Coupled with a spectral GNN using edge-aware Chebyshev convolutions, SPECTRA densifies underrepresented regions without degrading global accuracy. On benchmarks, SPECTRA consistently improves error in relevant target ranges while maintaining competitive overall MAE, and yields interpretable synthetic molecules whose structure reflects the underlying spectral geometry. Our results demonstrate that spectral, geometry-aware augmentation is an effective and efficient strategy for imbalanced molecular property regression.
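To make steps (iii) and (iv) and the rarity-aware budget more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The function names (`laplacian`, `spectral_interpolate`, `rarity_budget`) are hypothetical, and it assumes the two graphs already share a node ordering, so the Gromov-Wasserstein alignment and the node features are left out. The sign alignment plus QR re-orthonormalization, the 0.5 edge threshold, the inverse-density weighting, and the rule-of-thumb bandwidth are illustrative choices, not details taken from the abstract.

```python
import numpy as np


def laplacian(adj):
    """Combinatorial Laplacian L = D - A of a symmetric adjacency matrix."""
    return np.diag(adj.sum(axis=1)) - adj


def spectral_interpolate(adj_a, adj_b, t, threshold=0.5):
    """Blend two node-aligned graphs in the Laplacian eigenbasis.

    Assumption: both graphs have the same node count and node i of adj_a
    already corresponds to node i of adj_b.
    """
    lam_a, U_a = np.linalg.eigh(laplacian(adj_a))
    lam_b, U_b = np.linalg.eigh(laplacian(adj_b))

    # Eigenvectors are only defined up to sign: flip columns of U_b so that
    # each one points the same way as its counterpart in U_a.
    signs = np.sign(np.sum(U_a * U_b, axis=0))
    signs[signs == 0] = 1.0
    U_b = U_b * signs

    # Linear interpolation of eigenvalues and eigenvectors.
    lam_t = (1 - t) * lam_a + t * lam_b
    # A blend of two orthonormal bases is not orthonormal in general,
    # so re-orthonormalize it with a QR factorization.
    U_t, _ = np.linalg.qr((1 - t) * U_a + t * U_b)

    # Rebuild a Laplacian, then read edge weights off its off-diagonal part.
    L_t = U_t @ np.diag(lam_t) @ U_t.T
    weights = -L_t
    np.fill_diagonal(weights, 0.0)
    return (weights > threshold).astype(int)


def rarity_budget(y, total, bandwidth=None):
    """Split `total` synthetic samples across labels, favouring sparse regions.

    Uses a Gaussian kernel density estimate of the labels and allocates
    proportionally to 1 / density. Assumes the labels are not all identical.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    if bandwidth is None:
        # Normal-reference rule of thumb for the kernel width.
        bandwidth = 1.06 * y.std() * n ** (-1 / 5)
    z = (y[:, None] - y[None, :]) / bandwidth
    density = np.exp(-0.5 * z**2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))
    rarity = 1.0 / density
    return np.round(total * rarity / rarity.sum()).astype(int)


# Usage: a 4-node path graph and a 4-node cycle graph, already node-aligned.
path = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
cycle = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])
halfway = spectral_interpolate(path, cycle, t=0.5)

# Five common labels near 0 and one rare label at 5.0.
labels = [0.0, 0.1, 0.2, 0.1, 0.05, 5.0]
budget = rarity_budget(labels, total=100)
```

In this sketch, `t = 0` and `t = 1` recover the two input graphs, and the isolated label receives the largest share of the budget. Repeated eigenvalues make the eigenvector pairing ambiguous at intermediate `t`, which is presumably one reason a stable shared basis matters in the method itself.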
