LG, Aug 19, 2016

Operator-Valued Bochner Theorem, Fourier Feature Maps for Operator-Valued Kernels, and Vector-Valued Learning

arXiv:1608.05639v1
23 citations
Originality: Incremental advance
AI Analysis

This work develops a theoretical framework for learning vector-valued functions, with potential applications in multi-output learning. The advance is incremental: it extends existing random Fourier feature methods from the scalar setting to the operator-valued setting.

The paper constructs random Fourier feature maps for operator-valued kernels, generalizing the scalar case to vector-valued learning. It gives a closed-form formula for the probability measure the construction requires, and it proves uniform convergence of the approximate kernel to the exact kernel, in the Hilbert-Schmidt norm, on compact subsets of Euclidean space. The convergence result requires only differentiable kernels, an improvement over the twice-differentiability requirement in previous work on the scalar setting.

This paper presents a framework for computing random operator-valued feature maps for operator-valued positive definite kernels. This is a generalization of the random Fourier features for scalar-valued kernels to the operator-valued case. Our general setting is that of operator-valued kernels corresponding to RKHS of functions with values in a Hilbert space. We show that in general, for a given kernel, there are potentially infinitely many random feature maps, which can be bounded or unbounded. Most importantly, given a kernel, we present a general, closed form formula for computing a corresponding probability measure, which is required for the construction of the Fourier features, and which, unlike the scalar case, is not uniquely and automatically determined by the kernel. We also show that, under appropriate conditions, random bounded feature maps can always be computed. Furthermore, we show the uniform convergence, under the Hilbert-Schmidt norm, of the resulting approximate kernel to the exact kernel on any compact subset of Euclidean space. Our convergence requires differentiable kernels, an improvement over the twice-differentiability requirement in previous work in the scalar setting. We then show how operator-valued feature maps and their approximations can be employed in a general vector-valued learning framework. The mathematical formulation is illustrated by numerical examples on matrix-valued kernels.
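
To make the idea concrete, the following is a minimal sketch of random Fourier features for one simple matrix-valued kernel. It is not the paper's general construction. It assumes a separable kernel K(x, t) = k(x - t) A, where k is a Gaussian scalar kernel and A is a fixed symmetric positive semidefinite matrix; in this special case the scalar spectral measure (a Gaussian) can be reused directly. All function names, the matrix A, and the parameter values are hypothetical and chosen for illustration only.

```python
import numpy as np


def sample_frequencies(dim, num_features, sigma=1.0, seed=0):
    """Draw frequencies from N(0, sigma^-2 I), the spectral measure of
    the Gaussian kernel exp(-||x - t||^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, 1.0 / sigma, size=(num_features, dim))


def scalar_features(x, omegas):
    """Scalar random Fourier feature vector [cos(w.x), sin(w.x)] / sqrt(D)."""
    proj = omegas @ x
    return np.concatenate([np.cos(proj), np.sin(proj)]) / np.sqrt(omegas.shape[0])


def exact_kernel(x, t, A, sigma=1.0):
    """Exact separable matrix-valued kernel k(x - t) * A (assumed form)."""
    diff = x - t
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma**2)) * A


def approx_kernel(x, t, A, omegas):
    """Approximate kernel: inner product of scalar features, times A."""
    return (scalar_features(x, omegas) @ scalar_features(t, omegas)) * A


# Illustrative inputs (hypothetical values).
x = np.array([0.3, -0.2])
t = np.array([-0.1, 0.4])
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])  # symmetric positive definite output matrix

omegas = sample_frequencies(dim=2, num_features=5000, sigma=1.0, seed=0)

# For matrices, the Hilbert-Schmidt norm is the Frobenius norm.
err = np.linalg.norm(exact_kernel(x, t, A) - approx_kernel(x, t, A, omegas), ord="fro")
print(f"Hilbert-Schmidt error with 5000 features: {err:.4f}")
```

Increasing `num_features` should shrink the error at a rate of roughly one over the square root of the number of features. For non-separable operator-valued kernels the probability measure is not determined this simply, which is the situation the paper's closed-form formula addresses.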
