LG | ML | Jan 15, 2024

Efficient Nonparametric Tensor Decomposition for Binary and Count Data

arXiv:2401.07711v1 | 7 citations | h-index: 5 | AAAI
Originality: Highly original
AI Analysis

This work addresses the analysis of discrete (binary and count) data stored in high-order tensors, as arises in applications such as recommendation systems and event modeling, where Gaussian-based decompositions are a poor fit.

The paper addresses tensor decomposition for binary and count data, which traditional Gaussian-based methods handle poorly. It proposes ENTED, a nonparametric approach that combines Gaussian processes with variational inference, and reports better performance and computational advantages on real-world tensor completion tasks.

In numerous applications, binary reactions or event counts are observed and stored within high-order tensors. Tensor decompositions (TDs) serve as a powerful tool to handle such high-dimensional and sparse data. However, many traditional TDs are explicitly or implicitly designed based on the Gaussian distribution, which is unsuitable for discrete data. Moreover, most TDs rely on predefined multi-linear structures, such as the CP and Tucker formats, and may therefore not be expressive enough to handle complex real-world datasets. To address these issues, we propose ENTED, an Efficient Nonparametric TEnsor Decomposition for binary and count tensors. Specifically, we first employ a nonparametric Gaussian process (GP) to replace traditional multi-linear structures. Next, we utilize the Pólya-Gamma augmentation, which provides a unified framework to establish conjugate models for binary and count distributions. Finally, to address the computational cost of GPs, we enhance the model by incorporating sparse orthogonal variational inference of inducing points, which offers a more effective covariance approximation within GPs and stochastic natural gradient updates for nonparametric models. We evaluate our model on several real-world tensor completion tasks, considering binary and count datasets. The results demonstrate both better performance and computational advantages of the proposed model.
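The Pólya-Gamma step mentioned in the abstract can be illustrated with a small numerical check. The sketch below is not taken from the paper or its repository; it only demonstrates the standard Pólya-Gamma identity, under which a likelihood of the form exp(psi)^a / (1 + exp(psi))^b equals 2^(-b) * exp(kappa * psi) * E[exp(-omega * psi^2 / 2)] with kappa = a - b/2 and omega ~ PG(b, 0). Conditioned on omega, the expression is Gaussian in psi, which is what makes the model conjugate to a GP prior. The Bernoulli case uses a = y, b = 1; a negative binomial count likelihood fits the same form. All function names (`sample_pg_truncated`, `pg_mean`, and so on) are hypothetical, and the sampler is an approximate truncated sum-of-gammas construction chosen for brevity, not an exact or efficient one.

```python
import math
import random


def sample_pg_truncated(b, c, rng, n_terms=100):
    """Approximate draw from PG(b, c) using a truncated sum of gamma variables.

    Assumption: truncating the infinite series at n_terms introduces a small
    downward bias, acceptable for an illustration.
    """
    total = 0.0
    for k in range(1, n_terms + 1):
        g = rng.gammavariate(b, 1.0)  # Gamma(shape=b, scale=1)
        total += g / ((k - 0.5) ** 2 + c ** 2 / (4.0 * math.pi ** 2))
    return total / (2.0 * math.pi ** 2)


def pg_mean(b, c):
    """Closed-form mean of PG(b, c); the limit as c -> 0 is b / 4."""
    if abs(c) < 1e-8:
        return b / 4.0
    return b / (2.0 * c) * math.tanh(c / 2.0)


def bernoulli_logit_likelihood(y, psi):
    """p(y | psi) = exp(psi)^y / (1 + exp(psi)) for y in {0, 1}."""
    return math.exp(y * psi) / (1.0 + math.exp(psi))


def augmented_likelihood_mc(y, psi, rng, n_samples=5000):
    """Monte Carlo estimate of 0.5 * exp(kappa * psi) * E[exp(-omega * psi^2 / 2)]
    with omega ~ PG(1, 0) and kappa = y - 1/2."""
    kappa = y - 0.5
    acc = 0.0
    for _ in range(n_samples):
        omega = sample_pg_truncated(1.0, 0.0, rng)
        acc += math.exp(-omega * psi ** 2 / 2.0)
    return 0.5 * math.exp(kappa * psi) * acc / n_samples


rng = random.Random(0)
exact = bernoulli_logit_likelihood(1, 1.5)
approx = augmented_likelihood_mc(1, 1.5, rng)
print(f"exact={exact:.4f}  augmented (MC)={approx:.4f}")
```

The two printed values should agree up to Monte Carlo and truncation error. In a variational scheme, the closed-form `pg_mean` is typically what gets used in place of sampling, since the update for the latent function only needs the expectation of omega.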

Code Implementations: 1 repo