Paulina Hoyos

h-index: 2

Papers: 5

Citations: 15

Novelty: 73%

AI Score: 46

Ranked #35,889 of 194,257 authors (top 18%); #8,421 in LG (top 21%)

5 Papers

9.5 | LG | Jul 9

Group Invariant Spectral Embedding

Yeari Vigder, Paulina Hoyos, David Thong et al.

Spectral embedding methods are widely used for dimensionality reduction and clustering of high-dimensional datasets with intrinsic low-dimensional structures. Although many datasets of practical interest exhibit invariance under symmetries such as rotations, standard spectral embedding methods do not account for this, treating symmetry-related data points as unrelated. Our approach to this problem is to incorporate the symmetries directly into the affinity kernels used for spectral embedding. We analyze the case of a Riemannian data manifold $M$ with symmetries given by a compact Lie group $G$ and prove that, under suitable conditions, graph Laplacians constructed from three types of invariant kernels converge pointwise to explicit second-order differential operators on the quotient space $M/G$. Our analysis implies improved convergence rates, as the effective dimension drops according to the dimension of the group. We validate our approach on datasets with $\mathrm{SO}(2)$ or $\mathrm{SO}(3)$ symmetry, and show that $G$-invariant spectral embedding recovers the intrinsic geometry of the data, in contrast to standard spectral embedding, which fails to do so even in the limit of infinite data.
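To make the idea of an invariant affinity kernel concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's construction: the abstract does not say which three kernels are analyzed, so the code uses one natural candidate, a Gaussian kernel averaged over a finite set of `M` planar rotations standing in for $\mathrm{SO}(2)$. The function names, the bandwidth `eps`, and the discretization are all hypothetical choices.

```python
import numpy as np

def rotations(M):
    """Return the M planar rotations by multiples of 2*pi/M, shape (M, 2, 2)."""
    t = 2 * np.pi * np.arange(M) / M
    c, s = np.cos(t), np.sin(t)
    return np.stack([np.stack([c, -s], -1), np.stack([s, c], -1)], -2)

def invariant_kernel(X, Y, M=64, eps=0.5):
    """Gaussian affinity averaged over the rotated copies of each point in Y.

    Assumption: a discretized group average; not necessarily one of the
    three kernels studied in the paper.
    """
    Y_rot = np.einsum("mab,jb->mja", rotations(M), Y)        # (M, n_Y, 2)
    d2 = ((X[None, :, None, :] - Y_rot[:, None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / eps).mean(axis=0)                     # (n_X, n_Y)

def random_walk_laplacian(W):
    """Graph Laplacian L = I - D^{-1} W built from an affinity matrix W."""
    return np.eye(len(W)) - W / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
g = rotations(64)[5]                       # one element of the discretized group

W = invariant_kernel(X, X)
W_rotated = invariant_kernel(X, X @ g.T)   # rotate every second argument by g
print(np.allclose(W, W_rotated))           # affinities should not see the rotation
L = random_walk_laplacian(W)
```

Because the average runs over a whole group, rotating either argument by a group element only permutes the terms of the average, so the affinity depends on the orbits of the points rather than on the points themselves. This is the sense in which the resulting graph Laplacian lives on the quotient space.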

5.7 | LG | May 19

Group-Algebraic Tensors: Provably-optimal Equivariant Learning and Physical Symmetry Discovery

Paulina Hoyos, Shashanka Ubaru, Dongsung Huh et al.

We introduce the $\star_G$ tensor algebra, in which any finite group $G$ defines the multiplication rule, making equivariance an intrinsic algebraic property rather than an architectural constraint. The framework rests on three machine-verified theoretical pillars: (i) an Eckart-Young optimality guarantee for the $\star_G$-SVD, the first such result for symmetry-preserving tensor approximation, exact and polynomial-time; (ii) a Kronecker factorization that composes multiple symmetries by replacing $F_G$ with $F_{G_1} \otimes F_{G_2}$ with no architectural redesign; and (iii) a 600-line Lean 4 formalization of the $\star_G$ algebra. The framework provides capabilities that equivariant neural networks (ENNs) structurally cannot: a closed-form per-irreducible-representation decomposition of every prediction, and data-driven discovery of the symmetry group that best fits a dataset. As a non-trivial empirical demonstration, decomposing QM9 molecular geometry over the chiral octahedral subgroup of SO(3) recovers the Wigner-Eckart selection rules of angular momentum from data alone, with no quantum mechanical input: scalar properties are A$_1$-dominated, dipole components are T$_1$-dominated, the isotropic polarizability is uniquely insensitive to $l=1$ as the rank-2-trace decomposition $l=0 \oplus l=2$ requires, and the T$_1$/A$_1$ predictive-power ratio separates vector observables from scalar observables by a factor of five. On full QM9 (130,831 molecules), $\star_G$-SVD with ridge regression provides closed-form predictions at $\sim 50$-$90\times$ fewer parameters than parameter-matched MLPs. Algebraic equivariance thus complements architectural equivariance not as a faster-better-cheaper alternative but as a different mathematical affordance: provably-optimal symmetry-preserving compression, per-irrep interpretability, and data-driven physical discovery.
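The abstract does not spell out the $\star_G$ product, so the following Python sketch shows only the standard building block that a group-defined multiplication suggests: convolution in the group algebra, $(a * b)(g) = \sum_h a(h)\, b(h^{-1} g)$. Treating this as the scalar-level analogue of $\star_G$ is an assumption, and every function name here is hypothetical. The example uses the non-abelian group $S_3$ and checks that left translation commutes with the product, which is the equivariance property in its simplest form.

```python
import itertools
import numpy as np

def cayley_table(perms):
    """Cayley table T[i, j] = index of perms[i] composed with perms[j]."""
    index = {p: i for i, p in enumerate(perms)}
    n = len(perms)
    T = np.empty((n, n), dtype=int)
    for i, p in enumerate(perms):
        for j, q in enumerate(perms):
            T[i, j] = index[tuple(p[k] for k in q)]   # (p o q)(k) = p[q[k]]
    return T

def inverses(T):
    """inv[g] = index of g^{-1}, assuming the identity has index 0."""
    return np.argmax(T == 0, axis=1)

def group_conv(a, b, T):
    """(a * b)[g] = sum_h a[h] * b[h^{-1} g], the product in the group algebra."""
    inv = inverses(T)
    n = len(T)
    return np.array([sum(a[h] * b[T[inv[h], g]] for h in range(n))
                     for g in range(n)])

def left_translate(a, s, T):
    """(L_s a)[h] = a[s^{-1} h]."""
    return a[T[inverses(T)[s]]]

perms = list(itertools.permutations(range(3)))   # S_3, identity listed first
T = cayley_table(perms)
rng = np.random.default_rng(0)
a, b = rng.normal(size=6), rng.normal(size=6)

# Equivariance: translating the input translates the output.
for s in range(6):
    lhs = group_conv(left_translate(a, s, T), b, T)
    rhs = left_translate(group_conv(a, b, T), s, T)
    assert np.allclose(lhs, rhs)
```

For the cyclic group, whose Cayley table is addition modulo $n$, the same function reduces to circular convolution, which the discrete Fourier transform diagonalizes. That special case is presumably the role the abstract assigns to $F_G$, though this reading is an inference rather than something the abstract states.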

5.3 | LG | Mar 28, 2023

Diffusion Maps for Group-Invariant Manifolds

Paulina Hoyos, Joe Kileel

In this article, we consider the manifold learning problem when the data set is invariant under the action of a compact Lie group $K$. Our approach consists in augmenting the data-induced graph Laplacian by integrating over the $K$-orbits of the existing data points, which yields a $K$-invariant graph Laplacian $L$. We prove that $L$ can be diagonalized by using the unitary irreducible representation matrices of $K$, and we provide an explicit formula for computing its eigenvalues and eigenfunctions. In addition, we show that the normalized Laplacian operator $L_N$ converges to the Laplace-Beltrami operator of the data manifold with an improved convergence rate, where the improvement grows with the dimension of the symmetry group $K$. This work extends the steerable graph Laplacian framework of Landa and Shkolnisky from the case of $\operatorname{SO}(2)$ to arbitrary compact Lie groups.
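The diagonalization claim can be illustrated in the simplest setting, the $\operatorname{SO}(2)$ case of Landa and Shkolnisky, with a short Python sketch. The assumptions are loud ones: the orbit integral is replaced by a sum over `M` equally spaced rotations, planar points are stored as complex numbers, and only the affinity matrix is treated, not the normalized Laplacian. All names are hypothetical. The point is that the orbit-augmented affinity matrix has circulant blocks, so a Fourier transform along the rotation index splits one $NM \times NM$ eigenproblem into `M` problems of size $N \times N$.

```python
import numpy as np

def orbit_kernel(z, M, eps):
    """w[i, j, r] = exp(-|z_i - z_j * e^{i theta_r}|^2 / eps) over M rotations."""
    rot = np.exp(2j * np.pi * np.arange(M) / M)
    diff = z[:, None, None] - z[None, :, None] * rot[None, None, :]
    return np.exp(-np.abs(diff) ** 2 / eps)

def full_matrix(w):
    """Affinity between all rotated copies: W[(i,k),(j,l)] = w[i, j, (l-k) % M]."""
    N, _, M = w.shape
    k = np.arange(M)
    shift = (k[None, :] - k[:, None]) % M          # shift[k, l] = (l - k) mod M
    return w[:, :, shift].transpose(0, 2, 1, 3).reshape(N * M, N * M)

def block_eigenvalues(w):
    """Same spectrum from M Hermitian N x N blocks, one per Fourier frequency."""
    w_hat = np.fft.fft(w, axis=2)
    blocks = [np.linalg.eigvalsh(w_hat[:, :, m]) for m in range(w.shape[2])]
    return np.sort(np.concatenate(blocks))

rng = np.random.default_rng(0)
z = rng.normal(size=5) + 1j * rng.normal(size=5)   # 5 points in the plane
w = orbit_kernel(z, M=8, eps=1.0)

direct = np.linalg.eigvalsh(full_matrix(w))        # one 40 x 40 eigenproblem
fast = block_eigenvalues(w)                        # eight 5 x 5 eigenproblems
print(np.allclose(direct, fast))
```

The frequencies here are the characters of the rotation group. For a general compact Lie group $K$ the abstract replaces them with the unitary irreducible representation matrices of $K$, which this sketch does not attempt.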

1.2 | SP | Oct 21, 2025

SO(3)-invariant PCA with application to molecular data

Michael Fraiman, Paulina Hoyos, Tamir Bendory et al.

Principal component analysis (PCA) is a fundamental technique for dimensionality reduction and denoising; however, its application to three-dimensional data with arbitrary orientations (common in structural biology) presents significant challenges. A naive approach requires augmenting the dataset with many rotated copies of each sample, incurring prohibitive computational costs. In this paper, we extend PCA to 3D volumetric datasets with unknown orientations by developing an efficient and principled framework for SO(3)-invariant PCA that implicitly accounts for all rotations without explicit data augmentation. By exploiting underlying algebraic structure, we demonstrate that the computation involves only the square root of the total number of covariance entries, resulting in a substantial reduction in complexity. We validate the method on real-world molecular datasets, demonstrating its effectiveness and opening up new possibilities for large-scale, high-dimensional reconstruction problems.
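A toy analogue shows why invariance can shrink the computation this much. The Python sketch below deliberately swaps SO(3) acting on volumes for the much simpler cyclic group acting on length-`d` signals by shifts, so it illustrates the principle and is not the paper's algorithm. With that substitution, the second-moment matrix of the shift-augmented data is circulant: its `d * d` entries are determined by `d` numbers, and its eigenvalues can be read off from the average power spectrum without forming any shifted copies. Centering is omitted for simplicity, and the function names are hypothetical.

```python
import numpy as np

def augmented_second_moment(X):
    """Naive route: second-moment matrix of the data plus every cyclic shift."""
    n, d = X.shape
    shifted = np.concatenate([np.roll(X, k, axis=1) for k in range(d)])  # (n*d, d)
    return shifted.T @ shifted / (n * d)

def invariant_spectrum(X):
    """Implicit route: the same eigenvalues from the mean power spectrum."""
    n, d = X.shape
    return (np.abs(np.fft.fft(X, axis=1)) ** 2).mean(axis=0) / d

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 16))                     # 10 signals of length 16

naive = np.sort(np.linalg.eigvalsh(augmented_second_moment(X)))
implicit = np.sort(invariant_spectrum(X))
print(np.allclose(naive, implicit))
```

In this toy setting the principal components are the Fourier modes, fixed in advance by the group, and only their variances depend on the data.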

1.2 | QUANT-PH | Apr 23, 2021

An integer factorization algorithm which uses diffusion as a computational engine

Carlos A. Cadavid, Paulina Hoyos, Jay Jorgenson et al.

In this article we develop an algorithm which computes a divisor of an integer $N$, which is assumed to be neither prime nor the power of a prime. The algorithm uses discrete time heat diffusion on a finite graph. If $N$ has $m$ distinct prime factors, then the probability that our algorithm runs successfully is at least $p(m) = 1-(m+1)/2^{m}$. We compute the computational complexity of the algorithm in terms of classical, or digital, steps and in terms of diffusion steps, which is a concept that we define here. As we will discuss below, we assert that a diffusion step can and should be considered as being comparable to a quantum step for an algorithm which runs on a quantum computer. With this, we prove that our factorization algorithm uses at most $O((\log N)^{2})$ deterministic steps and at most $O((\log N)^{2})$ diffusion steps with an implied constant which is effective. By comparison, Shor's algorithm is known to use at most $O((\log N)^{2}\log (\log N) \log (\log \log N))$ quantum steps on a quantum computer. As an example of our algorithm, we simulate the diffusion computer algorithm on a desktop computer and obtain factorizations of $N=33$ and $N=1363$.
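The success bound $p(m) = 1-(m+1)/2^{m}$ is easy to evaluate for the two worked examples. The Python sketch below does only that arithmetic; it does not implement the diffusion algorithm, whose details the abstract does not give. The helper names are hypothetical, and trial division is used purely to count distinct prime factors of small inputs.

```python
from fractions import Fraction

def distinct_prime_factors(N):
    """Distinct prime factors of N by trial division (fine for small N)."""
    factors = set()
    p = 2
    while p * p <= N:
        while N % p == 0:
            factors.add(p)
            N //= p
        p += 1
    if N > 1:
        factors.add(N)
    return factors

def success_lower_bound(m):
    """The abstract's bound p(m) = 1 - (m + 1) / 2**m, as an exact fraction."""
    return 1 - Fraction(m + 1, 2 ** m)

for N in (33, 1363):
    primes = distinct_prime_factors(N)
    print(N, sorted(primes), success_lower_bound(len(primes)))

for m in (2, 3, 4, 10):
    print(m, success_lower_bound(m))
```

Both examples from the abstract have two distinct prime factors ($33 = 3 \cdot 11$ and $1363 = 29 \cdot 47$), so the guaranteed success probability for each is $p(2) = 1/4$. The bound improves quickly with more factors: $p(3) = 1/2$ and $p(4) = 11/16$.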