LGAIMLNov 2, 2018

Dantzig Selector with an Approximately Optimal Denoising Matrix and its Application to Reinforcement Learning

arXiv:1811.00958v11 citations
Originality Incremental advance
AI Analysis

This work addresses a specific bottleneck in sparse learning and reinforcement learning, offering an incremental improvement for practitioners in these fields.

The authors tackled the suboptimality of the data matrix as a denoising matrix in the Dantzig Selector, proposing the Optimal Denoising Dantzig Selector (ODDS) algorithm to approximate an optimal denoising matrix, and extended it to reinforcement learning, where it outperformed the conventional DS-TD algorithm in empirical experiments.

Dantzig Selector (DS) is widely used in compressed sensing and sparse learning for feature selection and sparse signal recovery. Since the DS formulation is essentially a linear programming optimization, many existing linear programming solvers can be simply applied for scaling up. The DS formulation can be explained as a basis pursuit denoising problem, wherein the data matrix (or measurement matrix) is employed as the denoising matrix to eliminate the observation noise. However, we notice that the data matrix may not be the optimal denoising matrix, as shown by a simple counter-example. This motivates us to pursue a better denoising matrix for defining a general DS formulation. We first define the optimal denoising matrix through a minimax optimization, which turns out to be an NPhard problem. To make the problem computationally tractable, we propose a novel algorithm, termed as Optimal Denoising Dantzig Selector (ODDS), to approximately estimate the optimal denoising matrix. Empirical experiments validate the proposed method. Finally, a novel sparse reinforcement learning algorithm is formulated by extending the proposed ODDS algorithm to temporal difference learning, and empirical experimental results demonstrate to outperform the conventional vanilla DS-TD algorithm.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes