Hierarchical Deep Reinforcement Learning Framework for Multi-Year Asset Management Under Budget Constraints
This addresses the challenge of scalable and cost-effective infrastructure maintenance planning for asset managers, though it is incremental as it builds on existing hierarchical and reinforcement learning methods.
The paper tackled the problem of multi-year infrastructure asset management under budget constraints by proposing a Hierarchical Deep Reinforcement Learning framework, which decomposed planning into budget allocation and maintenance prioritization, resulting in faster convergence, effective scaling, and near-optimal solutions for sewer networks of 10 to 20 sewersheds compared to Deep Q-Learning and genetic algorithms.
Budget planning and maintenance optimization are crucial for infrastructure asset management, ensuring cost-effectiveness and sustainability. However, the complexity arising from combinatorial action spaces, diverse asset deterioration, stringent budget constraints, and environmental uncertainty significantly limits existing methods' scalability. This paper proposes a Hierarchical Deep Reinforcement Learning methodology specifically tailored to multi-year infrastructure planning. Our approach decomposes the problem into two hierarchical levels: a high-level Budget Planner allocating annual budgets within explicit feasibility bounds, and a low-level Maintenance Planner prioritizing assets within the allocated budget. By structurally separating macro-budget decisions from asset-level prioritization and integrating linear programming projection within a hierarchical Soft Actor-Critic framework, the method efficiently addresses exponential growth in the action space and ensures rigorous budget compliance. A case study evaluating sewer networks of varying sizes (10, 15, and 20 sewersheds) illustrates the effectiveness of the proposed approach. Compared to conventional Deep Q-Learning and enhanced genetic algorithms, our methodology converges more rapidly, scales effectively, and consistently delivers near-optimal solutions even as network size grows.