LG, DC | Aug 29, 2024

High-Dimensional Sparse Data Low-rank Representation via Accelerated Asynchronous Parallel Stochastic Gradient Descent

arXiv:2408.16592v1 | 2.6 | h-index: 3

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific problem for researchers and practitioners dealing with large-scale, high-dimensional sparse datasets, offering incremental improvements in optimization efficiency.

The paper tackles the computational inefficiency and slow convergence of low-rank representation models on high-dimensional sparse data by proposing an accelerated asynchronous parallel stochastic gradient descent method, which the authors report outperforms existing algorithms in both accuracy and training time.

Data characterized by high dimensionality and sparsity are commonly used to describe real-world node interactions. Low-rank representation (LR) can map high-dimensional sparse (HDS) data to low-dimensional feature spaces and infer node interactions by modeling the latent associations in the data. Unfortunately, existing optimization algorithms for LR models are computationally inefficient and converge slowly on large-scale datasets. To address this issue, this paper proposes Accelerated Asynchronous Parallel Stochastic Gradient Descent (A2PSGD) for HDS data LR, built on three ideas: a) establishing a lock-free scheduler that responds simultaneously to scheduling requests from multiple threads; b) introducing a greedy load-balancing strategy that evens out the computational load among threads; c) incorporating Nesterov's accelerated gradient into the learning scheme to speed up model convergence. Empirical studies show that A2PSGD outperforms existing optimization algorithms for HDS data LR in both accuracy and training time.
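Ideas b) and c) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `greedy_balance` and `nag_step`, the hyperparameter defaults, and the L2-regularized squared-error objective are assumptions made for illustration, and the lock-free scheduler from idea a) is left out entirely. The load balancer uses the standard largest-first greedy heuristic, which may differ from the paper's exact strategy.

```python
import heapq


def greedy_balance(block_sizes, n_threads):
    """Greedy load balancing (assumed variant): take blocks largest first and
    give each one to the thread that currently has the smallest total load.

    block_sizes maps a block id to its number of observed entries.
    Returns a dict mapping thread id -> list of block ids.
    """
    heap = [(0, t) for t in range(n_threads)]  # (current load, thread id)
    heapq.heapify(heap)
    assignment = {t: [] for t in range(n_threads)}
    for block, size in sorted(block_sizes.items(), key=lambda kv: -kv[1]):
        load, t = heapq.heappop(heap)          # least-loaded thread
        assignment[t].append(block)
        heapq.heappush(heap, (load + size, t))
    return assignment


def nag_step(p, q, vp, vq, r, eta=0.01, lam=0.05, gamma=0.9):
    """One Nesterov-accelerated SGD step on a single observed entry r ~ p . q.

    p, q are latent factor vectors and vp, vq their velocity vectors (lists of
    floats, all updated in place). The gradient is evaluated at the look-ahead
    point p + gamma * vp, q + gamma * vq. Returns the look-ahead error.
    """
    k = len(p)
    p_la = [p[j] + gamma * vp[j] for j in range(k)]
    q_la = [q[j] + gamma * vq[j] for j in range(k)]
    err = r - sum(a * b for a, b in zip(p_la, q_la))
    for j in range(k):
        grad_p = -err * q_la[j] + lam * p_la[j]
        grad_q = -err * p_la[j] + lam * q_la[j]
        vp[j] = gamma * vp[j] - eta * grad_p
        vq[j] = gamma * vq[j] - eta * grad_q
        p[j] += vp[j]
        q[j] += vq[j]
    return err
```

In a parallel setting, each thread would presumably call something like `nag_step` on the entries of the blocks assigned to it by `greedy_balance`; how A2PSGD coordinates those threads without locks is described in the paper itself and is not reproduced here.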
