LG SI Aug 23, 2021

Influence-guided Data Augmentation for Neural Tensor Completion

arXiv:2108.10248v1, 18 citations
Originality: Incremental advance
AI Analysis

This work addresses overfitting in sparse tensor completion for applications such as recommendation systems. The advance is incremental: it adds a new augmentation technique on top of existing neural tensor completion methods.

The paper tackles the problem of inaccurate missing value prediction in sparse multi-dimensional data (tensor completion) by proposing DAIN, a data augmentation framework that uses influence functions to guide sampling, resulting in improved imputation accuracy on four real-world tensors.

How can we predict missing values in multi-dimensional data (or tensors) more accurately? The task of tensor completion is crucial in many applications such as personalized recommendation, image and video restoration, and link prediction in social networks. Many tensor factorization and neural network-based tensor completion algorithms have been developed to predict missing entries in partially observed tensors. However, they can produce inaccurate estimations as real-world tensors are very sparse, and these methods tend to overfit on the small amount of data. Here, we overcome these shortcomings by presenting a data augmentation technique for tensors. In this paper, we propose DAIN, a general data augmentation framework that enhances the prediction accuracy of neural tensor completion methods. Specifically, DAIN first trains a neural model and finds tensor cell importances with influence functions. After that, DAIN aggregates the cell importance to calculate the importance of each entity (i.e., an index of a dimension). Finally, DAIN augments the tensor by weighted sampling of entity importances and a value predictor. Extensive experimental results show that DAIN outperforms all data augmentation baselines in terms of enhancing imputation accuracy of neural tensor completion on four diverse real-world tensors. Ablation studies of DAIN substantiate the effectiveness of each component of DAIN. Furthermore, we show that DAIN scales near linearly to large datasets.
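The three steps in the abstract (cell importance, entity importance, weighted sampling plus a value predictor) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the per-cell importance scores are assumed to be already computed (the influence-function step is not shown), summing absolute scores is an assumed aggregation rule, and `entity_importance`, `augment`, and `predict_value` are hypothetical names introduced here.

```python
import random
from collections import defaultdict


def entity_importance(cells, cell_scores, n_modes):
    """Aggregate per-cell scores into per-entity scores for each mode.

    Assumption: an entity's importance is the sum of the absolute
    importance of every observed cell it appears in.
    """
    importance = [defaultdict(float) for _ in range(n_modes)]
    for idx, score in zip(cells, cell_scores):
        for mode, entity in enumerate(idx):
            importance[mode][entity] += abs(score)
    return importance


def augment(cells, cell_scores, predict_value, n_new, seed=0):
    """Sample up to n_new unobserved cells, weighting entities by importance."""
    rng = random.Random(seed)
    n_modes = len(cells[0])
    importance = entity_importance(cells, cell_scores, n_modes)
    observed = set(cells)
    new_cells = {}
    attempts = 0
    # Cap the number of attempts so the loop ends even if few cells are free.
    while len(new_cells) < n_new and attempts < 100 * n_new:
        attempts += 1
        # Draw one entity per mode, with probability proportional to importance.
        idx = tuple(
            rng.choices(list(imp.keys()), weights=list(imp.values()), k=1)[0]
            for imp in importance
        )
        if idx in observed or idx in new_cells:
            continue
        # The value predictor assigns a value to the sampled cell.
        new_cells[idx] = predict_value(idx)
    return new_cells


# Toy 3-mode tensor: observed cell indices and made-up importance scores.
cells = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (2, 1, 0)]
scores = [0.5, -0.2, 0.1, 0.9]


def constant_predictor(idx):
    """Stand-in for a trained value predictor."""
    return 3.0


augmented = augment(cells, scores, constant_predictor, n_new=3)
```

In a real pipeline, `constant_predictor` would be replaced by a trained model and `scores` by influence estimates from the trained tensor completion model; the sampled cells would then be added to the training set.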

Code Implementations: 1 repo
