The Entrapment Problem in Random Walk Decentralized Learning
This paper addresses convergence issues in decentralized learning over distributed data, but the contribution is incremental because it builds on existing Metropolis-Hastings (MH) based methods.
The paper tackles the entrapment problem in random-walk decentralized SGD, where the walk can get stuck at certain nodes and slow convergence. It proposes the MHLJ algorithm, which adds random jumps to escape entrapment, and reports improved convergence rates that are validated in experiments.
This paper explores decentralized learning in a graph-based setting, where data is distributed across nodes. We investigate a decentralized SGD algorithm that utilizes a random walk to update a global model based on local data. Our focus is on designing the transition probability matrix to speed up convergence. While importance sampling can enhance centralized learning, its decentralized counterpart, using the Metropolis-Hastings (MH) algorithm, can lead to the entrapment problem, where the random walk becomes stuck at certain nodes, slowing convergence. To address this, we propose the Metropolis-Hastings with Lévy Jumps (MHLJ) algorithm, which incorporates random perturbations (jumps) to overcome entrapment. We theoretically establish the convergence rate and error gap of MHLJ and validate our findings through numerical experiments.
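To make the entrapment effect concrete, the following is a minimal sketch, not the paper's implementation. It builds the standard Metropolis-Hastings transition matrix on a small ring graph with simple-random-walk proposals, using an importance-style target distribution in which one node dominates. The ring topology, the target weights, the helper names `mh_transition_matrix` and `with_jumps`, and the jump rule itself (with probability `p_jump`, step to a uniformly chosen neighbor without an acceptance test) are all assumptions made for illustration. The actual MHLJ jump rule may differ, for example in how jump lengths are drawn.

```python
import numpy as np

def mh_transition_matrix(adj, target):
    """Standard MH chain on a graph, proposing a uniform neighbor at each step."""
    adj = np.asarray(adj, dtype=float)
    target = np.asarray(target, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adj[i, j] > 0:
                # Propose j with probability 1/deg(i), accept with the MH ratio.
                accept = min(1.0, (target[j] * deg[i]) / (target[i] * deg[j]))
                P[i, j] = accept / deg[i]
        # Rejected proposals keep the walk at node i (self-loop).
        P[i, i] = 1.0 - P[i].sum()
    return P

def with_jumps(P, adj, p_jump):
    """Hypothetical perturbation: mix P with an unconditioned neighbor step."""
    adj = np.asarray(adj, dtype=float)
    srw = adj / adj.sum(axis=1, keepdims=True)  # simple random walk
    return (1.0 - p_jump) * P + p_jump * srw

# Assumed toy setup: a ring of 5 nodes where node 0 has 100x the importance.
n = 5
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
target = np.array([100.0, 1.0, 1.0, 1.0, 1.0])
target /= target.sum()

P_mh = mh_transition_matrix(adj, target)
P_jump = with_jumps(P_mh, adj, p_jump=0.1)

print("self-loop at node 0, plain MH:  ", P_mh[0, 0])
print("self-loop at node 0, with jumps:", P_jump[0, 0])
```

In this toy setup the plain MH walk stays at node 0 with probability about 0.99 per step, which is the entrapment behavior described above, and the perturbed chain lowers that to about 0.891. Note that in this sketch the perturbed chain no longer has `target` as its exact stationary distribution, so escaping faster comes at the cost of some sampling bias.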