Oscar López

4papers

9citations

Novelty56%

AI Score27

Ranked #160,604 of 201,326 authors (top 80%)#498 in DS (top 89%)

4 Papers

MLJun 9, 2023

Spectral gap-based deterministic tensor completion

Kameron Decker Harris, Oscar López, Angus Read et al.

Tensor completion is a core machine learning algorithm used in recommender systems and other domains with missing data. While the matrix case is well-understood, theoretical results for tensor problems are limited, particularly when the sampling patterns are deterministic. Here we bound the generalization error of the solutions of two tensor completion methods, Poisson loss and atomic norm minimization, providing tighter bounds in terms of the target tensor rank. If the ground-truth tensor is order $t$ with CP-rank $r$, the dependence on $r$ is improved from $r^{2(t-1)(t^2-t-1)}$ in arXiv:1910.10692 to $r^{2(t-1)(3t-5)}$. The error in our bounds is deterministically controlled by the spectral gap of the sampling sparsity pattern. We also prove several new properties for the atomic tensor norm, reducing the rank dependence from $r^{3t-3}$ in arXiv:1711.04965 to $r^{3t-5}$ under random sampling schemes. A limitation is that atomic norm minimization, while theoretically interesting, leads to inefficient algorithms. However, numerical experiments illustrate the dependence of the reconstruction error on the spectral gap for the practical max-quasinorm, ridge penalty, and Poisson loss minimization algorithms. This view through the spectral gap is a promising window for further study of tensor algorithms.

DSAug 10, 2024

Simple and Nearly-Optimal Sampling for Rank-1 Tensor Completion via Gauss-Jordan

Alejandro Gomez-Leos, Oscar López

We revisit the sample and computational complexity of completing a rank-1 tensor in $\otimes_{i=1}^{N} \mathbb{R}^{d}$, given a uniformly sampled subset of its entries. We present a characterization of the problem (i.e. nonzero entries) which admits an algorithm amounting to Gauss-Jordan on a pair of random linear systems. For example, when $N = Θ(1)$, we prove it uses no more than $m = O(d^2 \log d)$ samples and runs in $O(md^2)$ time. Moreover, we show any algorithm requires $Ω(d\log d)$ samples. By contrast, existing upper bounds on the sample complexity are at least as large as $d^{1.5} μ^{Ω(1)} \log^{Ω(1)} d$, where $μ$ can be $Θ(d)$ in the worst case. Prior work obtained these looser guarantees in higher rank versions of our problem, and tend to involve more complicated algorithms.

MEJan 25, 2022

Zero-Truncated Poisson Regression for Sparse Multiway Count Data Corrupted by False Zeros

Oscar López, Daniel M. Dunlavy, Richard B. Lehoucq

We propose a novel statistical inference methodology for multiway count data that is corrupted by false zeros that are indistinguishable from true zero counts. Our approach consists of zero-truncating the Poisson distribution to neglect all zero values. This simple truncated approach dispenses with the need to distinguish between true and false zero counts and reduces the amount of data to be processed. Inference is accomplished via tensor completion that imposes low-rank tensor structure on the Poisson parameter space. Our main result shows that an $N$-way rank-$R$ parametric tensor $\boldsymbol{\mathscr{M}}\in(0,\infty)^{I\times \cdots\times I}$ generating Poisson observations can be accurately estimated by zero-truncated Poisson regression from approximately $IR^2\log_2^2(I)$ non-zero counts under the nonnegative canonical polyadic decomposition. Our result also quantifies the error made by zero-truncating the Poisson distribution when the parameter is uniformly bounded from below. Therefore, under a low-rank multiparameter model, we propose an implementable approach guaranteed to achieve accurate regression in under-determined scenarios with substantial corruption by false zeros. Several numerical experiments are presented to explore the theoretical results.

OCJul 9, 2016

Beating level-set methods for 3D seismic data interpolation: a primal-dual alternating approach

Rajiv Kumar, Oscar López, Damek Davis et al.

Acquisition cost is a crucial bottleneck for seismic workflows, and low-rank formulations for data interpolation allow practitioners to `fill in' data volumes from critically subsampled data acquired in the field. Tremendous size of seismic data volumes required for seismic processing remains a major challenge for these techniques. We propose a new approach to solve residual constrained formulations for interpolation. We represent the data volume using matrix factors, and build a block-coordinate algorithm with constrained convex subproblems that are solved with a primal-dual splitting scheme. The new approach is competitive with state of the art level-set algorithms that interchange the role of objectives with constraints. We use the new algorithm to successfully interpolate a large scale 5D seismic data volume, generated from the geologically complex synthetic 3D Compass velocity model, where 80% of the data has been removed.