Neural Stochastic Dual Dynamic Programming
This work addresses scalability limitations for practitioners in process optimization, offering an incremental improvement over existing SDDP methods.
The paper tackled the exponential worst-case complexity of Stochastic Dual Dynamic Programming (SDDP) in multi-stage stochastic optimization by introducing a neural model that learns low-dimensional value functions to accelerate optimization. The result showed that the proposed ν-SDDP significantly reduces solving cost without sacrificing solution quality compared to SDDP and reinforcement learning algorithms on synthetic and real-world problems.
Stochastic dual dynamic programming (SDDP) is a state-of-the-art method for solving multi-stage stochastic optimization, widely used for modeling real-world process optimization tasks. Unfortunately, SDDP has a worst-case complexity that scales exponentially in the number of decision variables, which severely limits applicability to only low dimensional problems. To overcome this limitation, we extend SDDP by introducing a trainable neural model that learns to map problem instances to a piece-wise linear value function within intrinsic low-dimension space, which is architected specifically to interact with a base SDDP solver, so that can accelerate optimization performance on new instances. The proposed Neural Stochastic Dual Dynamic Programming ($ν$-SDDP) continually self-improves by solving successive problems. An empirical investigation demonstrates that $ν$-SDDP can significantly reduce problem solving cost without sacrificing solution quality over competitors such as SDDP and reinforcement learning algorithms, across a range of synthetic and real-world process optimization problems.