1.9DCMar 31
A Practical GPU-Accelerated Implementation of Orthogonal Matching PursuitAriel Lubonja, Sebastian Kazmarek Praesius, Trac Duy Tran
Finding the sparsest solution to the underdetermined system $\mathbf{y}=\mathbf{Ax}$, given a tolerance, is known to be NP-hard. Many approximate solutions to this problem exist, and Orthogonal Matching Pursuit (OMP) is one of the most widely used. However, existing OMP implementations don't take full advantage of matrix properties or modern CPU and GPU-based Linear Algebra kernels. For this paper, we implemented an efficient implementation of OMP that leverages Cholesky inverse properties as well as the power of GPUs to deliver up to \textbf{310x speedup over Scikit-Learn} and \textbf{26x over SPAMS}. The package is published on PyPI (\texttt{pip install batched-omp}) and is fully scikit-learn compatible.
LGNov 13, 2025
Semi-Unified Sparse Dictionary Learning with Learnable Top-K LISTA and FISTA EncodersFengsheng Lin, Shengyi Yan, Trac Duy Tran
We present a semi-unified sparse dictionary learning framework that bridges the gap between classical sparse models and modern deep architectures. Specifically, the method integrates strict Top-$K$ LISTA and its convex FISTA-based variant (LISTAConv) into the discriminative LC-KSVD2 model, enabling co-evolution between the sparse encoder and the dictionary under supervised or unsupervised regimes. This unified design retains the interpretability of traditional sparse coding while benefiting from efficient, differentiable training. We further establish a PALM-style convergence analysis for the convex variant, ensuring theoretical stability under block alternation. Experimentally, our method achieves 95.6\% on CIFAR-10, 86.3\% on CIFAR-100, and 88.5\% on TinyImageNet with faster convergence and lower memory cost ($<$4GB GPU). The results confirm that the proposed LC-KSVD2 + LISTA/LISTAConv pipeline offers an interpretable and computationally efficient alternative for modern deep architectures.