LG OC Mar 31, 2021

CDiNN - Convex Difference Neural Networks

Parameswaran Sankaranarayanan, Raghunathan Rengaswamy

arXiv:2103.17231v25.517 citations

Originality: Incremental advance

AI Analysis

This work addresses challenges in neural network-based optimal control for applications requiring efficient and reliable optimization, though it appears incremental as it builds on ICNNs.

The paper tackles the problem of using neural networks for optimal control by addressing the limitations of Input Convex Neural Networks (ICNNs), which can lead to high approximation errors and fail to capture simple dynamic structures like linear time delay systems. It introduces CDiNN, a new architecture that learns functions as differences of polyhedral convex functions, enabling efficient convex optimization with convergence guarantees and reducing each iteration to a linear programming problem.
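To make the "linear programming problem at each iteration" claim concrete, the following is a sketch of the standard difference-of-convex linearization step under assumed notation. The symbols a_i, b_i, c_j, d_j, the max-affine form of g and h, and the polyhedral input set X are illustrative assumptions, not the paper's own parametrization.

```latex
% Assumed notation (not from the paper): g and h are max-affine, X is a polyhedral input set.
f(x) = g(x) - h(x), \qquad
g(x) = \max_{i} \,(a_i^\top x + b_i), \qquad
h(x) = \max_{j} \,(c_j^\top x + d_j)

% Linearize h at the current iterate x_k using an active piece j_k:
j_k \in \arg\max_{j} \,(c_j^\top x_k + d_j)

% The convex subproblem, written in epigraph form, is a linear program in (x, t):
x_{k+1} \in \arg\min_{x \in X,\; t} \; t - c_{j_k}^\top x
\quad \text{s.t.} \quad a_i^\top x + b_i \le t \;\; \text{for all } i
```

Because h is convex, h(x) >= h(x_k) + c_{j_k}^T (x - x_k) for every x, so each step of this kind cannot increase f.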

Neural networks with ReLU activation functions have been shown to be universal function approximators and learn function mappings as non-smooth functions. Recently, there has been considerable interest in the use of neural networks in applications such as optimal control. It is well known that optimization involving non-convex, non-smooth functions is computationally intensive and has limited convergence guarantees. Moreover, the choice of optimization hyper-parameters used in gradient descent/ascent significantly affects the quality of the obtained solutions. A new neural network architecture, the Input Convex Neural Network (ICNN), learns the output as a convex function of the inputs, thereby allowing the use of efficient convex optimization methods. Using ICNNs to determine the input that minimizes the output has two major problems: learning a non-convex function as a convex mapping could result in significant function approximation error, and the existing representations cannot capture simple dynamic structures like linear time delay systems. We attempt to address these problems by introducing a new neural network architecture, which we call CDiNN, that learns the function as a difference of polyhedral convex functions from data. We also discuss how, in some cases, the optimal input can be obtained from CDiNN through difference-of-convex optimization with convergence guarantees, and how at each iteration the problem reduces to a linear programming problem.
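The difference-of-convex iteration described in the abstract can be illustrated with a minimal one-dimensional sketch. Everything here is an assumption made for illustration: the affine pieces are chosen by hand rather than learned from data, the function names (`max_affine`, `active_slope`, `solve_linearized`, `dca`) are hypothetical, and the per-iteration linear program is solved by enumerating candidate vertices, which only works in one dimension. A realistic implementation would call an LP solver instead.

```python
# Minimal 1-D sketch of difference-of-convex optimization (assumed setup, not the paper's code).
# f(x) = g(x) - h(x), where g and h are polyhedral convex: a max over affine pieces (slope, offset).

def max_affine(pieces, x):
    """Evaluate max_i (a_i * x + b_i)."""
    return max(a * x + b for a, b in pieces)

def active_slope(pieces, x):
    """Return the slope of one piece attaining the max at x (a subgradient)."""
    a, _ = max(pieces, key=lambda p: p[0] * x + p[1])
    return a

def solve_linearized(g_pieces, s, lo, hi):
    """Minimize g(x) - s*x over [lo, hi].

    In epigraph form this is a linear program. In 1-D its optimum lies at a
    bound or at a kink of g, so we simply enumerate those candidates.
    """
    candidates = [lo, hi]
    for i, (a1, b1) in enumerate(g_pieces):
        for a2, b2 in g_pieces[i + 1:]:
            if a1 != a2:
                x = (b2 - b1) / (a1 - a2)  # intersection of two affine pieces
                if lo <= x <= hi:
                    candidates.append(x)
    return min(candidates, key=lambda x: max_affine(g_pieces, x) - s * x)

def dca(g_pieces, h_pieces, x0, lo, hi, max_iter=50, tol=1e-9):
    """Iterate: linearize h at x, solve the convex (LP) subproblem, stop when f stalls."""
    def f(z):
        return max_affine(g_pieces, z) - max_affine(h_pieces, z)

    x = x0
    for _ in range(max_iter):
        s = active_slope(h_pieces, x)
        x_new = solve_linearized(g_pieces, s, lo, hi)
        if f(x) - f(x_new) <= tol:  # no further decrease
            break
        x = x_new
    return x, f(x)

# Hand-picked example: f has a local minimum at x = -1 and a lower one at x = 1.
G = [(-3, -3), (0, 0), (3, -3)]  # g(x) = max(-3x - 3, 0, 3x - 3)
H = [(-1, 0), (2, 0)]            # h(x) = max(-x, 2x)

left = dca(G, H, x0=-0.5, lo=-4.0, hi=4.0)
right = dca(G, H, x0=0.5, lo=-4.0, hi=4.0)
print(left, right)
```

In this toy example the two starting points reach different minima. This reflects the general behavior of such iterations: the objective never increases from one step to the next, but the point reached depends on the initialization.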
