Generalized Intersection Kernel
This work addresses kernel methods in machine learning for computer vision and data analysis, extending existing kernels to data that can have both negative and positive entries.
The study introduces the generalized intersection (GInt) kernel and the normalized generalized min-max (NGMM) kernel, which generalize the histogram intersection kernel to data with both negative and positive entries. Classification experiments on 40 UCI datasets show that GInt performs fairly well without tuning, while NGMM typically outperforms it.
Following the very recent line of work on the ``generalized min-max'' (GMM) kernel, this study proposes the ``generalized intersection'' (GInt) kernel and the related ``normalized generalized min-max'' (NGMM) kernel. In computer vision, the (histogram) intersection kernel has been popular, and the GInt kernel generalizes it to data that can have both negative and positive entries. Through an extensive empirical classification study on 40 datasets from the UCI repository, we show that this (tuning-free) GInt kernel performs fairly well. The empirical results also demonstrate that the NGMM kernel typically outperforms the GInt kernel. Interestingly, the NGMM kernel has another interpretation: it is the ``asymmetrically transformed'' version of the GInt kernel, based on the idea of ``asymmetric hashing''. Just like the GMM kernel, the NGMM kernel can be efficiently linearized through, for example, generalized consistent weighted sampling (GCWS), as empirically validated in our study. Owing to the discrete nature of the hashed values, GCWS also provides a scheme for approximate near neighbor search.
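The abstract does not spell out the kernel definitions, so the following is only a minimal sketch under stated assumptions. It assumes the convention from the GMM line of work: each D-dimensional vector is first mapped to a nonnegative 2D-dimensional vector by splitting positive and negative parts, and for GInt and NGMM the result is rescaled to sum to one. GInt is then taken to be the sum of coordinate-wise minima, and NGMM the ratio of summed minima to summed maxima. The hashing routine uses Ioffe-style consistent weighted sampling on the transformed vector as a stand-in for GCWS. All function names (`split_signs`, `normalized_nonneg`, `gint`, `ngmm`, `gcws_hashes`) are hypothetical and chosen for illustration.

```python
import math
import random


def split_signs(u):
    """Map a real vector of length D to a nonnegative vector of length 2D.

    Assumed convention: coordinate x becomes (x, 0) if x > 0, else (0, -x).
    """
    out = []
    for x in u:
        out.extend((x, 0.0) if x > 0 else (0.0, -x))
    return out


def normalized_nonneg(u):
    """Sign-split transform followed by rescaling so the entries sum to one."""
    w = split_signs(u)
    total = sum(w)
    if total <= 0:
        raise ValueError("the all-zero vector cannot be normalized")
    return [x / total for x in w]


def gint(u, v):
    """Assumed GInt: sum of coordinate-wise minima of the normalized vectors."""
    a, b = normalized_nonneg(u), normalized_nonneg(v)
    return sum(min(x, y) for x, y in zip(a, b))


def ngmm(u, v):
    """Assumed NGMM: min-max ratio computed on the normalized vectors."""
    a, b = normalized_nonneg(u), normalized_nonneg(v)
    num = sum(min(x, y) for x, y in zip(a, b))
    den = sum(max(x, y) for x, y in zip(a, b))
    return num / den


def gcws_hashes(u, k, seed=0):
    """Draw k consistent-weighted-sampling hashes of the normalized vector.

    Each hash is a pair (i*, t*). The same seed must be used for every
    vector so that the random draws are shared across vectors.
    """
    w = normalized_nonneg(u)
    hashes = []
    for j in range(k):
        rng = random.Random(seed * 1_000_003 + j)
        best = None
        for i, wi in enumerate(w):
            # Draw for every coordinate, even zero ones, to keep draws aligned.
            r = rng.gammavariate(2.0, 1.0)
            c = rng.gammavariate(2.0, 1.0)
            beta = rng.random()
            if wi <= 0:
                continue
            t = math.floor(math.log(wi) / r + beta)
            a = c / math.exp(r * (t - beta) + r)
            if best is None or a < best[0]:
                best = (a, i, t)
        hashes.append((best[1], best[2]))
    return hashes
```

Under these assumptions, because each normalized vector sums to one and min(x, y) + max(x, y) = x + y, the summed maxima equal 2 - GInt, so NGMM = GInt / (2 - GInt). For example, for u = (1, -2, 0) and v = (-1, -1, 2) the sketch gives GInt = 0.25 and NGMM = 1/7. The fraction of positions where `gcws_hashes(u, k)` and `gcws_hashes(v, k)` agree should approach the NGMM value as k grows, which is the property that makes the hashed values usable for linearization and for approximate near neighbor search.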