On the generalization of Tanimoto-type kernels to real valued functions
This work provides a more flexible similarity measure for real-valued data, but it is incremental as it builds on existing Tanimoto kernel extensions.
The paper tackles the problem of generalizing the Tanimoto kernel from binary or nonnegative attributes to arbitrary real-valued functions, resulting in a new formulation that comes with an explicit feature representation and a smooth approximation.
The Tanimoto kernel (Jaccard index) is a well-known tool for describing the similarity between sets of binary attributes. It has been extended to the case where the attributes are nonnegative real values. This paper introduces a more general Tanimoto kernel formulation that makes it possible to measure the similarity of arbitrary real-valued functions. This extension is constructed by unifying the representation of the attributes via properly chosen sets. After deriving the general form of the kernel, an explicit feature representation is extracted from the kernel function, and a simple way of including general kernels in the Tanimoto kernel is shown. Finally, the kernel is also expressed as a quotient of piecewise linear functions, and a smooth approximation is provided.
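To make the starting point concrete, the following minimal Python sketch covers only the two classical cases the abstract mentions: the set-based Tanimoto (Jaccard) similarity and its commonly used min-max form for nonnegative real vectors. It does not reproduce the paper's general construction for arbitrary real-valued functions. The function names, the convention of returning 1.0 for two empty or all-zero inputs, and the `eps` smoothing of the absolute value are all assumptions made for illustration; the paper's own smooth approximation may differ.

```python
from math import sqrt


def tanimoto_binary(a, b):
    """Tanimoto (Jaccard) similarity of two sets of binary attributes."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty sets are identical
    return len(a & b) / len(union)


def tanimoto_minmax(x, y):
    """Min-max form for nonnegative real vectors: sum(min) / sum(max)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("this form assumes nonnegative attributes")
    num = sum(min(p, q) for p, q in zip(x, y))
    den = sum(max(p, q) for p, q in zip(x, y))
    return 1.0 if den == 0 else num / den  # assumed convention for all-zero input


def tanimoto_piecewise_linear(x, y, eps=0.0):
    """Same quantity as a quotient of piecewise linear functions.

    Uses min(p, q) = (p + q - |p - q|) / 2 and max(p, q) = (p + q + |p - q|) / 2.
    With eps > 0, |d| is replaced by sqrt(d*d + eps*eps), one possible smooth
    surrogate (an illustrative choice, not taken from the paper).
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    total = sum(p + q for p, q in zip(x, y))
    diff = sum(sqrt((p - q) ** 2 + eps * eps) for p, q in zip(x, y))
    den = total + diff
    return 1.0 if den == 0 else (total - diff) / den


if __name__ == "__main__":
    # Indicator vectors of the sets {0, 2} and {0, 1} over three attributes.
    print(tanimoto_binary({0, 2}, {0, 1}))
    print(tanimoto_minmax([1, 0, 1], [1, 1, 0]))
    print(tanimoto_piecewise_linear([0.5, 2.0], [1.5, 1.0]))
    print(tanimoto_piecewise_linear([0.5, 2.0], [1.5, 1.0], eps=1e-3))
```

On 0/1 indicator vectors the min-max form coincides with the set-based value, and with `eps=0` the piecewise linear quotient is an exact rewriting of the min-max form. A positive `eps` makes the expression differentiable everywhere at the cost of a small downward bias.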