Aseem Paranjape, Ravi K. Sheth
We consider approximating the linearly evolved 2-point correlation function (2pcf) of dark matter $ξ_{\rm lin}(r;\boldsymbol{θ})$ in a cosmological model with parameters $\boldsymbol{θ}$ as the linear combination $ξ_{\rm lin}(r;\boldsymbol{θ})\approx\sum_i\,b_i(r)\,w_i(\boldsymbol{θ})$, where the functions $\mathcal{B}=\{b_i(r)\}$ form a $\textit{model-agnostic basis}$ for the linear 2pcf. This decomposition is important for model-agnostic analyses of the baryon acoustic oscillation (BAO) feature in the nonlinear 2pcf of galaxies that fix $\mathcal{B}$ and leave the coefficients $\{w_i\}$ free. To date, such analyses have made simple but sub-optimal choices for $\mathcal{B}$, such as monomials. We develop a machine learning framework for systematically discovering a $\textit{minimal}$ basis $\mathcal{B}$ that describes $ξ_{\rm lin}(r)$ near the BAO feature in a wide class of cosmological models. We use a custom architecture, denoted $\texttt{BiSequential}$, for a neural network (NN) that explicitly realizes the separation between $r$ and $\boldsymbol{θ}$ above. The optimal NN trained on data in which only $\{Ω_{\rm m},h\}$ are varied in a $\textit{flat}$ $Λ$CDM model produces a basis $\mathcal{B}$ comprising $9$ functions capable of describing $ξ_{\rm lin}(r)$ to $\sim0.6\%$ accuracy in $\textit{curved}$ $w$CDM models varying 7 parameters within $\sim5\%$ of their fiducial, flat $Λ$CDM values. Scales such as the peak, linear point and zero-crossing of $ξ_{\rm lin}(r)$ are also recovered with very high accuracy. We compare our approach to other compression schemes in the literature, and speculate that $\mathcal{B}$ may also encompass $ξ_{\rm lin}(r)$ in modified gravity models near our fiducial $Λ$CDM model. Using our basis functions in model-agnostic BAO analyses can potentially lead to significant statistical gains.
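
The separation between $r$ and the cosmological parameters described above can be illustrated with a minimal two-branch sketch. This is not the $\texttt{BiSequential}$ implementation: the layer sizes, the tanh activation, the rescaled range of $r$, the parameter samples and all function names below are assumptions made for illustration, and the weights are random rather than trained. One branch maps $r$ to $K$ basis values $b_i(r)$, the other maps the parameters to $K$ coefficients $w_i$, and the output is their inner product.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random weights for a small fully connected network (illustrative only)."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass: tanh on hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def separable_forward(basis_params, weight_params, r, theta):
    """Return xi[m, n] = sum_i w_i(theta_m) * b_i(r_n)."""
    B = mlp(basis_params, r[:, None])   # (n_r, K): basis functions b_i(r)
    W = mlp(weight_params, theta)       # (n_models, K): coefficients w_i(theta)
    return W @ B.T                      # (n_models, n_r)

rng = np.random.default_rng(0)
K = 9                                        # number of basis functions quoted above
basis_params = init_mlp(rng, [1, 16, K])     # hidden width 16 is an assumption
weight_params = init_mlp(rng, [2, 16, K])    # 2 inputs, e.g. (Omega_m, h)

r = np.linspace(0.6, 1.2, 50)                # hypothetical rescaled separations
theta = rng.uniform(0.9, 1.1, size=(20, 2))  # hypothetical parameter samples
xi = separable_forward(basis_params, weight_params, r, theta)
print(xi.shape)
```

Because the output is a product of a parameter-only factor and an $r$-only factor, the matrix of predictions over any set of models and separations has rank at most $K$. Once such a network is trained, the $r$ branch alone supplies the basis $\mathcal{B}$, and the coefficients can be left free in a fit.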