LG | ML | Dec 19, 2023

Principled Weight Initialisation for Input-Convex Neural Networks

arXiv:2312.12474v1 | 16 citations | h-index: 38 | NIPS
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in training ICNNs, which are used in energy-based modelling and optimal transport, by providing a tailored initialisation strategy. The advance is incremental because it builds on existing signal propagation theory.

The authors tackle ineffective weight initialisation in Input-Convex Neural Networks (ICNNs), a problem that stems from their non-negative weight constraints. They derive a principled initialisation method that accelerates learning and improves generalisation, and they demonstrate its effectiveness on a drug discovery task involving molecular latent space exploration.

Input-Convex Neural Networks (ICNNs) are networks that guarantee convexity in their input-output mapping. These networks have been successfully applied for energy-based modelling, optimal transport problems and learning invariances. The convexity of ICNNs is achieved by using non-decreasing convex activation functions and non-negative weights. Because of these peculiarities, previous initialisation strategies, which implicitly assume centred weights, are not effective for ICNNs. By studying signal propagation through layers with non-negative weights, we are able to derive a principled weight initialisation for ICNNs. Concretely, we generalise signal propagation theory by removing the assumption that weights are sampled from a centred distribution. In a set of experiments, we demonstrate that our principled initialisation effectively accelerates learning in ICNNs and leads to better generalisation. Moreover, we find that, in contrast to common belief, ICNNs can be trained without skip-connections when initialised correctly. Finally, we apply ICNNs to a real-world drug discovery task and show that they allow for more effective molecular latent space exploration.
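The two structural claims in the abstract can be illustrated with a small NumPy sketch: convexity follows from non-negative weights combined with a convex, non-decreasing activation, and non-centred weights shift the signal from layer to layer. The sketch below is not the paper's initialisation. It assumes a skip-free network with ReLU activations and uses a naive non-negative initialisation (the absolute value of centred Gaussian weights) purely to show the drift in pre-activation means. The names `make_icnn` and `icnn_forward` are hypothetical.

```python
import numpy as np

def make_icnn(rng, sizes):
    """Random parameters for a skip-free ICNN sketch (hypothetical helper).

    The first layer is unconstrained; every later layer has non-negative weights.
    """
    params = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        if i > 0:
            # Naive non-negative init for illustration, NOT the paper's scheme.
            W = np.abs(W)
        params.append((W, np.zeros(n_out)))
    return params

def icnn_forward(params, x):
    """Return the network output and the mean pre-activation of each layer."""
    h, pre_means = x, []
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        pre_means.append(float(h.mean()))
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)  # ReLU is convex and non-decreasing
    return h, pre_means

rng = np.random.default_rng(0)
params = make_icnn(rng, [4, 64, 64, 64, 1])

# Numerical convexity check: f(t*a + (1-t)*b) <= t*f(a) + (1-t)*f(b).
a = rng.normal(size=(256, 4))
b = rng.normal(size=(256, 4))
t = rng.uniform(size=(256, 1))
f_mix, _ = icnn_forward(params, t * a + (1 - t) * b)
f_a, _ = icnn_forward(params, a)
f_b, _ = icnn_forward(params, b)
rhs = t * f_a + (1 - t) * f_b
convex_ok = bool(np.all(f_mix <= rhs + 1e-9 * (1.0 + np.abs(rhs))))

# Mean pre-activation per layer under the naive non-negative init.
_, pre_means = icnn_forward(params, a)
print("convexity inequality holds on all samples:", convex_ok)
print("mean pre-activation per layer:", pre_means)
```

With non-negative weights and non-negative post-ReLU inputs, every pre-activation after the first layer is itself non-negative, so its mean grows with layer width instead of staying near zero. This is the kind of drift that an initialisation designed for centred weights does not account for.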

Code Implementations: 1 repo
