Autoencoding any Data through Kernel Autoencoders
The paper provides a novel kernel-based method for data representation, potentially useful for domains such as graph data, but the contribution is incremental, as it builds on existing autoencoder and kernel concepts.
The paper introduces Kernel Autoencoders (KAE) to represent data in Hilbert spaces, extending autoencoding to infinite dimensions and any data type via RKHS, with theoretical analysis and empirical validation on simulated data and molecules.
This paper investigates a novel algorithmic approach to data representation based on kernel methods. Assuming that the observations lie in a Hilbert space X, the introduced Kernel Autoencoder (KAE) is the composition of mappings from vector-valued Reproducing Kernel Hilbert Spaces (vv-RKHSs) that minimizes the expected reconstruction error. Beyond a first extension of the autoencoding scheme to possibly infinite-dimensional Hilbert spaces, KAE further makes it possible to autoencode any kind of data by choosing X to be itself an RKHS. A theoretical analysis of the model is carried out, providing a generalization bound and shedding light on its connection with Kernel Principal Component Analysis. The proposed algorithms are then detailed at length: they crucially rely on the form taken by the minimizers, revealed by a dedicated Representer Theorem. Finally, numerical experiments on both simulated data and real labeled graphs (molecules) provide empirical evidence of KAE's performance.
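To make the connection with Kernel Principal Component Analysis more concrete, the sketch below treats plain Kernel PCA as an encode/decode scheme whose reconstruction error is measured in the feature space. It is not the paper's KAE algorithm, only the related baseline the abstract mentions. The Gaussian kernel, the bandwidth `gamma`, the random data, and the function names `gaussian_kernel` and `kpca_autoencode` are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kpca_autoencode(K, d):
    """Encode training points with Kernel PCA and score their reconstruction.

    K is the (n, n) kernel matrix of the training points, d the code dimension.
    Returns the (n, d) codes and the per-point squared reconstruction error,
    measured in the centered feature space.
    """
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    Kc = H @ K @ H                            # kernel of centered features
    eigvals, eigvecs = np.linalg.eigh(Kc)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d]     # indices of the d largest
    lam = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    V = eigvecs[:, order]
    # Coordinate of point i on principal direction j is sqrt(lam_j) * V[i, j].
    codes = V * np.sqrt(lam)
    # Squared feature norm minus the squared norm of the projection.
    errors = np.diag(Kc) - np.sum(codes**2, axis=1)
    return codes, errors

# Hypothetical data: 30 points in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
K = gaussian_kernel(X, X, gamma=0.1)
codes, errors = kpca_autoencode(K, d=3)
mean_error = float(errors.mean())
```

Because only the kernel matrix `K` is needed, the same routine applies to any data type for which a kernel can be evaluated, including graphs. This is the property the abstract relies on when it chooses X to be an RKHS.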