Reinforcement Learning for Graph Coloring: Understanding the Power and Limits of Non-Label Invariant Representations
This work addresses the need for label-invariant representations in machine learning for graph-based problems, which is incremental as it builds on existing methods to identify a specific bottleneck.
The paper tackled the register allocation problem by casting it as a graph coloring task and using a Proximal Policy Optimization model, but found that the model's performance was inconsistent when graphs were relabeled, highlighting a critical limitation.
Register allocation is one of the most important problems for modern compilers. With a practically unlimited number of user variables and a small number of CPU registers, assigning variables to registers without conflicts is a complex task. This work demonstrates the use of casting the register allocation problem as a graph coloring problem. Using technologies such as PyTorch and OpenAI Gymnasium Environments we will show that a Proximal Policy Optimization model can learn to solve the graph coloring problem. We will also show that the labeling of a graph is critical to the performance of the model by taking the matrix representation of a graph and permuting it. We then test the model's effectiveness on each of these permutations and show that it is not effective when given a relabeling of the same graph. Our main contribution lies in showing the need for label reordering invariant representations of graphs for machine learning models to achieve consistent performance.