LG · Mar 27, 2020

Kernel Operations on the GPU, with Autodiff, without Memory Overflows

arXiv:2004.11127v2 · 220 citations
AI Analysis

The paper addresses a memory bottleneck that affects researchers and practitioners in machine learning and geometric applications, and reports better performance than standard GPU baselines such as PyTorch CUDA tensors, Halide, and TVM.

The KeOps library tackles the memory bottleneck in GPU-based kernel and distance matrix computations by providing a fast, memory-efficient solution with autodiff support, enabling processing of millions of samples in seconds.

The KeOps library provides fast and memory-efficient GPU support for tensors whose entries are given by a mathematical formula, such as kernel and distance matrices. KeOps alleviates the major bottleneck of tensor-centric libraries for kernel and geometric applications: memory consumption. It also supports automatic differentiation and outperforms standard GPU baselines, including PyTorch CUDA tensors and the Halide and TVM libraries. KeOps combines optimized C++/CUDA schemes with bindings for high-level languages: Python (NumPy and PyTorch), Matlab and GNU R. As a result, high-level "quadratic" codes can now scale up to large data sets, with millions of samples processed in seconds. KeOps brings graphics-like performance to kernel methods and is freely available on standard repositories (PyPI, CRAN). To showcase its versatility, we provide tutorials in a wide range of settings online at www.kernel-operations.io.
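To make the memory argument concrete, here is a minimal sketch in plain NumPy of a kernel reduction a_i = sum_j k(x_i, y_j) b_j that never stores the full N-by-M kernel matrix. It is not the KeOps API and does not reproduce its C++/CUDA schemes: the function names, the Gaussian kernel, the `sigma` parameter, and the `block` size are all assumptions chosen for illustration, and the code runs on the CPU. It only shows the general idea that a matrix defined by a formula can be reduced tile by tile.

```python
import numpy as np


def gaussian_matvec_dense(x, y, b, sigma=1.0):
    """Reference version: build the full (N, M) kernel matrix, then multiply."""
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)  # (N, M)
    return np.exp(-sq_dists / (2.0 * sigma**2)) @ b


def gaussian_matvec_blocked(x, y, b, sigma=1.0, block=256):
    """Same reduction, computed one tile of `block` columns at a time.

    Temporary storage scales with N * block (times the point dimension for
    the pairwise differences) instead of N * M.
    """
    out = np.zeros((x.shape[0],) + b.shape[1:], dtype=np.result_type(x, b))
    for start in range(0, y.shape[0], block):
        y_blk = y[start:start + block]          # (block, D) slice of the points
        b_blk = b[start:start + block]          # matching slice of the signal
        sq_dists = ((x[:, None, :] - y_blk[None, :, :]) ** 2).sum(axis=-1)
        out += np.exp(-sq_dists / (2.0 * sigma**2)) @ b_blk  # accumulate tile
    return out


# Hypothetical toy data: 500 target points, 700 source points in 3D.
rng = np.random.default_rng(0)
x = rng.standard_normal((500, 3))
y = rng.standard_normal((700, 3))
b = rng.standard_normal((700, 2))

a_dense = gaussian_matvec_dense(x, y, b)
a_blocked = gaussian_matvec_blocked(x, y, b, block=128)
print(np.allclose(a_dense, a_blocked))
```

The dense version needs memory proportional to N times M, which is what limits tensor-centric code at large sample counts. The blocked version trades that for a loop over tiles, so the full matrix only ever exists as a formula.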

