Machine learning for option pricing: an empirical investigation of network architectures
This work addresses the problem of optimizing neural network architectures for financial modeling, offering incremental improvements in accuracy and efficiency for practitioners in quantitative finance.
The authors investigated how different neural network architectures affect accuracy and training time for option pricing and implied volatility prediction, finding that generalized highway networks performed best for Black-Scholes and Heston models, while a simplified DGM variant excelled for implied volatility.
We consider the supervised learning problem of learning the price of an option or the implied volatility, given appropriate input data (model parameters) and corresponding output data (option prices or implied volatilities). The majority of articles in this literature consider a plain feedforward neural network architecture to connect the neurons that learn the function mapping inputs to outputs. In this article, motivated by methods in image classification and recent advances in machine learning methods for PDEs, we investigate empirically whether and how the choice of network architecture affects the accuracy and training time of a machine learning algorithm. Taking the mean squared error and the training time as criteria, we find that the generalized highway network architecture achieves the best performance within the considered parameter budgets for the Black-Scholes and Heston option pricing problems. For the transformed implied volatility problem, a simplified DGM variant achieves the lowest error among the tested architectures. For completeness, we also carry out a capacity-normalised comparison, in which all architectures are evaluated with an equal number of parameters. Finally, for the implied volatility problem, we include additional experiments using real market data.