Rethinking Reinforcement Learning based Logic Synthesis
This work addresses logic synthesis optimization for chip design and has practical industrial applications, though it appears incremental over prior RL approaches.
The paper addresses a weakness of reinforcement learning-based logic synthesis: learned policies are insensitive to circuit features and yield operator sequences that are, to some extent, permutation invariant. The authors develop a new RL method that automatically recognizes critical operators and generates operator sequences that generalize to unseen circuits, achieving a good balance among delay, area, and runtime on benchmarks that include an industrial-scale circuit.
Recently, reinforcement learning has been used to address logic synthesis by formulating the operator sequence optimization problem as a Markov decision process. However, through extensive experiments, we find that the learned policy makes decisions independently of the circuit features (i.e., states) and yields an operator sequence that is, to some extent, invariant to permutations of its operators. Based on these findings, we develop a new RL-based method that can automatically recognize critical operators and generate common operator sequences that generalize to unseen circuits. Our algorithm is verified on the EPFL benchmark, a private dataset, and a circuit at industrial scale. Experimental results demonstrate that it achieves a good balance among delay, area, and runtime, and is practical for industrial usage.
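To make the Markov decision process formulation and the notion of permutation invariance concrete, the following is a minimal sketch and not the authors' method. Everything in it is an assumption made for illustration: the operator names, the `ToyCircuit` state, the multiplicative `EFFECTS` table, and the area-delay-product reward are all hypothetical. Because the toy effects commute, the sequence is permutation invariant by construction; with a real synthesis tool, `step` would be replaced by an actual operator call, and the spread would have to be measured empirically.

```python
import random
from dataclasses import dataclass

@dataclass
class ToyCircuit:
    """Hypothetical state: two scalar circuit features."""
    area: float
    delay: float

# Made-up multiplicative effects (area factor, delay factor) per operator.
# The operator names are illustrative labels, not a claim about any tool.
EFFECTS = {
    "rewrite": (0.95, 1.00),
    "refactor": (0.97, 0.99),
    "balance": (1.00, 0.92),
    "resub": (0.96, 1.01),
}

def adp(circuit):
    """Area-delay product, used here as the toy cost."""
    return circuit.area * circuit.delay

def step(circuit, op):
    """One MDP transition: apply an operator, return (next state, reward)."""
    area_factor, delay_factor = EFFECTS[op]
    nxt = ToyCircuit(circuit.area * area_factor, circuit.delay * delay_factor)
    reward = adp(circuit) - adp(nxt)  # reward = cost reduction
    return nxt, reward

def rollout(circuit, sequence):
    """Apply an operator sequence; return the final state and total reward."""
    total = 0.0
    for op in sequence:
        circuit, reward = step(circuit, op)
        total += reward
    return circuit, total

def permutation_spread(circuit, sequence, trials=20, seed=0):
    """Max minus min final cost over random shufflings of the sequence."""
    rng = random.Random(seed)
    costs = []
    for _ in range(trials):
        perm = list(sequence)
        rng.shuffle(perm)
        final, _ = rollout(circuit, perm)
        costs.append(adp(final))
    return max(costs) - min(costs)

if __name__ == "__main__":
    start = ToyCircuit(area=1000.0, delay=50.0)
    seq = ["rewrite", "balance", "resub", "refactor"]
    final, total = rollout(start, seq)
    print("final cost:", adp(final), "total reward:", total)
    print("spread over permutations:", permutation_spread(start, seq))
```

A spread near zero means the final cost depends on which operators are applied rather than on their order. A fixed sequence such as `seq` also plays the role of a state-independent policy, since it never inspects the circuit features.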