Liana Fong

2 Papers

DC · Aug 11, 2018
Matrix Factorization on GPUs with Memory Optimization and Approximate Computing

Wei Tan, Shiyu Chang, Liana Fong et al.

Matrix factorization (MF) discovers latent features from observations and has shown great promise in collaborative filtering, data compression, feature extraction, word embedding, and other fields. While many problem-specific optimization techniques have been proposed, alternating least squares (ALS) remains popular due to its general applicability (e.g., it easily handles positive-unlabeled inputs), fast convergence, and parallelization capability. Current MF implementations are either optimized for a single machine or require a large computer cluster, yet both remain insufficient: a single machine provides limited compute power for large-scale data, while multiple machines suffer from the network communication bottleneck. To address this challenge, accelerating ALS on graphics processing units (GPUs) is a promising direction. We propose a novel approach to enhancing MF efficiency via both memory optimization and approximate computing. The former exploits the GPU memory hierarchy to increase data reuse, while the latter reduces unnecessary computing without hurting the convergence of learning algorithms. Extensive experiments on large-scale datasets show that our solution not only outperforms competing CPU solutions by a large margin but also achieves a 2x-4x performance gain over state-of-the-art GPU solutions. Our implementations are open-sourced and publicly available.
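For orientation, the alternating structure of ALS can be sketched on the CPU in plain Python. This is only an illustration of textbook regularized ALS, not the authors' GPU implementation, and it includes neither the memory optimizations nor the approximate computing described above. The function names (`solve`, `update_side`, `als`, `rmse`) and the defaults for `rank`, `lam`, and `iters` are assumptions made for this example.

```python
import random


def solve(A, b):
    """Solve the small dense system A x = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def update_side(ratings, fixed, n_rows, rank, lam):
    """Recompute one factor matrix while the other (`fixed`) is held constant.

    `ratings` maps a row index to a list of (column index, value) pairs.
    Each row solves its own regularized least-squares problem, which is
    what makes ALS easy to parallelize.
    """
    out = []
    for r in range(n_rows):
        A = [[lam if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for c, v in ratings.get(r, []):
            f = fixed[c]
            for i in range(rank):
                b[i] += v * f[i]
                for j in range(rank):
                    A[i][j] += f[i] * f[j]
        out.append(solve(A, b))
    return out


def als(observed, n_users, n_items, rank=2, lam=0.01, iters=20, seed=0):
    """Factor the observed (user, item, value) triples into X and Y."""
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for u, i, v in observed:
        by_user.setdefault(u, []).append((i, v))
        by_item.setdefault(i, []).append((u, v))
    X = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    Y = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    for _ in range(iters):
        X = update_side(by_user, Y, n_users, rank, lam)  # fix Y, solve for X
        Y = update_side(by_item, X, n_items, rank, lam)  # fix X, solve for Y
    return X, Y


def rmse(observed, X, Y):
    """Root-mean-square error of the factorization on the observed entries."""
    err = sum(
        (v - sum(a * b for a, b in zip(X[u], Y[i]))) ** 2 for u, i, v in observed
    )
    return (err / len(observed)) ** 0.5
```

Calling `als(triples, n_users, n_items)` on a list of `(user, item, value)` triples returns the two factor matrices, and `rmse` reports how well their product reproduces the observed entries. Missing entries are simply absent from the list, so each row's system is built only from the values that were observed.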

DS · Jul 31, 2019
"Sliced" Subwindow Search: a Sublinear-complexity Solution to the Maximum Rectangle Problem

Max Reuter, Gheorghe-Teodor Bercea, Liana Fong

Given a 2D matrix of positive and negative numbers, how might one draw a rectangle within it whose contents sum higher than those of every other rectangle? This fundamental problem, commonly known as the maximum rectangle problem or subwindow search, spans many computational domains. Yet the problem has not been solved without demanding computational resources at least linearly proportional to the size of the matrix. In this work, we present a new approach to the problem that achieves sublinear time and memory complexities by interpolating between a small number of equidistant sections of the matrix. Applied to natural images, our solution outperforms the state of the art by achieving an 11x increase in speed and memory efficiency at 99% comparative accuracy. In general, our solution outperforms existing solutions when matrices are sufficiently large and a marginal decrease in accuracy is acceptable, as in many problems involving natural images. As such, it is well suited to real-time applications and to a variety of computationally hard instances of the maximum rectangle problem.
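To make the problem statement concrete, the sketch below solves the maximum rectangle problem exactly with a classical baseline: fix a pair of rows, collapse the rows between them into column sums, and run Kadane's algorithm on that 1D array. This is not the "sliced" method described above and involves no interpolation. It reads every entry of the matrix, so it is exactly the kind of at-least-linear-cost solution the abstract contrasts against. The function name `max_sum_rectangle` and its return layout are assumptions made for this example.

```python
def max_sum_rectangle(matrix):
    """Return (best_sum, top, left, bottom, right) with inclusive indices.

    Exact baseline: O(rows^2 * cols) time, O(cols) extra memory.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    best = (matrix[0][0], 0, 0, 0, 0)
    for top in range(n_rows):
        # col_sums[c] holds the sum of column c over rows top..bottom.
        col_sums = [0] * n_cols
        for bottom in range(top, n_rows):
            for c in range(n_cols):
                col_sums[c] += matrix[bottom][c]
            # Kadane's algorithm: best contiguous run of columns.
            running, start = 0, 0
            for c, value in enumerate(col_sums):
                if running <= 0:
                    # A non-positive prefix never helps, so restart here.
                    running, start = value, c
                else:
                    running += value
                if running > best[0]:
                    best = (running, top, start, bottom, c)
    return best
```

An approximate method can be evaluated against this baseline by comparing the sum of the rectangle it returns with `best_sum`, which is one way to read a "comparative accuracy" figure.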