LG, IR, ML · Oct 28, 2016

Toward Implicit Sample Noise Modeling: Deviation-driven Matrix Factorization

arXiv:1610.09274v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses noise handling in matrix factorization for applications such as recommendation systems and sensor data. It is an incremental advance because it builds on existing weighting frameworks.

The paper tackles the problem of diverse implicit noise in data for matrix factorization by proposing a model that learns deviations and dynamically reweights instances, resulting in improved accuracy and efficiency over state-of-the-art methods.

The objective function of a matrix factorization model usually aims to minimize the average regression error contributed by each element. However, in the presence of stochastic noise, the implicit deviations of sample data from their true values are almost surely diverse, so data points are not equally suitable for fitting a model. In this case, simply averaging the cost over the data in the objective function is not ideal. Intuitively, we would like to place more emphasis on the reliable instances (i.e., those that contain less noise) while training a model. Motivated by this observation, we derive our formulation from a theoretical framework for optimal weighting under a heteroscedastic noise distribution. Specifically, by modeling and learning the deviation of the data, we design a novel matrix factorization model. Our model has two advantages. First, it jointly learns the deviation and dynamically reweights instances, allowing the model to converge to a better solution. Second, during learning, the deviated instances are assigned lower weights, which leads to faster convergence since the model does not need to overfit the noise. Experiments are conducted on clean recommendation datasets and noisy sensor datasets to test the effectiveness of the model in various scenarios. The results show that our model outperforms state-of-the-art factorization and deep learning models in both accuracy and efficiency.
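The idea of jointly learning deviations and reweighting instances can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not spell out. It assumes, purely for illustration, one noise variance per row, re-estimated after each epoch from that row's mean squared residual, with weights proportional to the inverse variance and scaled so the most reliable row has weight 1. The function names, the warm-up phase, and all hyperparameters are hypothetical.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def row_weights(var):
    # Inverse-variance weights, scaled so the most reliable row gets weight 1.
    smallest = min(var)
    return [smallest / v for v in var]


def fit_deviation_weighted_mf(ratings, n_rows, n_cols, rank=2, epochs=60,
                              warmup=20, lr=0.02, reg=0.01, eps=1e-3, seed=0):
    """Illustrative weighted SGD factorization; ratings is a list of (row, col, value)."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    var = [1.0] * n_rows  # assumed per-row noise variance (hypothetical choice)
    data = list(ratings)
    for epoch in range(epochs):
        # Plain, unweighted SGD during warm-up; deviation-based weights afterwards.
        weights = row_weights(var) if epoch >= warmup else [1.0] * n_rows
        rng.shuffle(data)
        for i, j, r in data:
            u, v = U[i], V[j]
            err = r - dot(u, v)
            w = weights[i]
            for k in range(rank):
                uk, vk = u[k], v[k]
                u[k] += lr * (w * err * vk - reg * uk)
                v[k] += lr * (w * err * uk - reg * vk)
        # Re-estimate each row's deviation from its current residuals.
        sq = [0.0] * n_rows
        cnt = [0] * n_rows
        for i, j, r in data:
            err = r - dot(U[i], V[j])
            sq[i] += err * err
            cnt[i] += 1
        var = [sq[i] / cnt[i] + eps if cnt[i] else 1.0 for i in range(n_rows)]
    return U, V, row_weights(var)


# Synthetic check: the first 10 rows are nearly clean, the last 10 are noisy.
gen = random.Random(1)
n_rows, n_cols, true_rank = 20, 15, 2
P = [[gen.gauss(0, 1) for _ in range(true_rank)] for _ in range(n_rows)]
Q = [[gen.gauss(0, 1) for _ in range(true_rank)] for _ in range(n_cols)]
noise_std = [0.1 if i < 10 else 2.0 for i in range(n_rows)]
ratings = [(i, j, dot(P[i], Q[j]) + gen.gauss(0, noise_std[i]))
           for i in range(n_rows) for j in range(n_cols)]

U, V, weights = fit_deviation_weighted_mf(ratings, n_rows, n_cols)
clean_mean = sum(weights[:10]) / 10
noisy_mean = sum(weights[10:]) / 10
print("mean weight, clean rows:", clean_mean)
print("mean weight, noisy rows:", noisy_mean)
```

In this synthetic setup the rows generated with larger noise are expected to end up with smaller weights than the nearly clean rows, which is the qualitative behavior the abstract describes. A per-entry or per-column deviation model would follow the same pattern with a different variance estimate.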

