Deriving Neural Architectures from Sequence and Graph Kernels
This work formalizes neural architecture design for structured data, a problem relevant to machine learning researchers and practitioners. Because it builds on existing kernel methods, the contribution may appear incremental.
The authors design neural architectures for structured objects such as sequences and graphs by deriving neural operations from combinatorial kernels, and they report state-of-the-art results in language modeling and molecular graph regression.
The design of neural architectures for structured objects is typically guided by experimental insights rather than a formal process. In this work, we appeal to kernels over combinatorial structures, such as sequences and graphs, to derive appropriate neural operations. We introduce a class of deep recurrent neural operations and formally characterize their associated kernel spaces. Our recurrent modules compare the input to virtual reference objects (cf. filters in CNNs) via the kernels. Similar to traditional neural operations, these reference objects are parameterized and directly optimized in end-to-end training. We empirically evaluate the proposed class of neural architectures on standard applications such as language modeling and molecular graph regression, achieving state-of-the-art results across these applications.
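To make the idea of comparing an input to a parameterized reference object more concrete, here is a minimal sketch of one way a sequence kernel can be evaluated by a recurrence. It is an illustration under stated assumptions, not the authors' exact operation: it assumes a kernel restricted to (possibly non-contiguous) bigrams, a scalar decay factor that penalizes older matches, and a single reference bigram `(w1, w2)`. The function names, the `decay` parameter, and the NumPy implementation are all hypothetical choices made for this example. A brute-force sum over all index pairs is included to check that the recurrence computes the same quantity.

```python
import numpy as np


def bigram_kernel_recurrence(x, w1, w2, decay):
    """Score a sequence x (shape T x d) against a reference bigram (w1, w2).

    Assumed form (illustrative only): every pair i < j contributes
    decay**(T - 2 - i) * <w1, x_i> * <w2, x_j>.
    """
    c1 = 0.0  # decayed sum of matches between w1 and positions seen so far
    c2 = 0.0  # decayed sum of bigram matches (w1 at i, w2 at a later j)
    for x_t in x:
        a_t = float(np.dot(w1, x_t))
        b_t = float(np.dot(w2, x_t))
        # Update c2 first: it must use c1 from the previous step so that
        # the w1 match strictly precedes the w2 match.
        c2 = decay * c2 + c1 * b_t
        c1 = decay * c1 + a_t
    return c2


def bigram_kernel_bruteforce(x, w1, w2, decay):
    """Same quantity as above, computed as an explicit sum over pairs i < j."""
    T = len(x)
    total = 0.0
    for i in range(T):
        for j in range(i + 1, T):
            weight = decay ** (T - 2 - i)
            total += weight * float(np.dot(w1, x[i])) * float(np.dot(w2, x[j]))
    return total
```

With one-hot inputs this behaves like a soft pattern matcher: for the sequence `[e0, e2, e1]`, the reference bigram `(e0, e1)`, and `decay = 0.5`, both functions return 0.5, because the single match is discounted once for the intervening position. In a trainable version of this sketch, `w1` and `w2` would play the role of the reference object and be optimized end to end, in the spirit of the abstract's analogy to CNN filters.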