Maximum entropy GFlowNets with soft Q-learning
This work provides a theoretical link for researchers in generative modeling and reinforcement learning, though it appears incremental as it builds on existing GFN and maximum entropy RL concepts.
The paper addresses the unclear connection between Generative Flow Networks (GFNs) and maximum entropy reinforcement learning by constructing a reward function that makes the relationship exact. The result is maximum entropy GFNs, which achieve the maximum entropy attainable without constraints on the state space.
Generative Flow Networks (GFNs) have emerged as a powerful tool for sampling discrete objects from unnormalized distributions, offering a scalable alternative to Markov Chain Monte Carlo (MCMC) methods. While GFNs draw inspiration from maximum entropy reinforcement learning (RL), the connection between the two has remained largely unclear and has seemed to hold only in specific cases. This paper addresses the connection by constructing an appropriate reward function, thereby establishing an exact relationship between GFNs and maximum entropy RL. This construction allows us to introduce maximum entropy GFNs, which, in contrast to GFNs with a uniform backward policy, achieve the maximum entropy attainable by GFNs without constraints on the state space.
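The contrast between a uniform backward policy and a maximum entropy one can be checked by hand on a toy example. The sketch below is an illustration only, not the paper's reward construction or its soft Q-learning procedure. It assumes a hypothetical five-state DAG and made-up rewards, and it assumes that the entropy in question is the entropy of the distribution over complete trajectories. Under that assumption, with the terminal marginal fixed proportional to the reward, the entropy is largest when all trajectories reaching the same terminal state are equally likely, which corresponds to a backward policy proportional to the number of paths from the initial state (called path_count_pb here, a name introduced for this example).

```python
import math
from functools import lru_cache

# Hypothetical toy DAG, stored as child -> list of parents.
# "s0" is the initial state; "x" and "y" are the terminal objects.
PARENTS = {
    "a": ["s0"],
    "b": ["s0", "a"],
    "x": ["a", "b"],
    "y": ["b"],
}
REWARD = {"x": 3.0, "y": 1.0}  # made-up unnormalized target values


@lru_cache(maxsize=None)
def n_paths(state):
    """Number of distinct paths from s0 to `state`."""
    if state == "s0":
        return 1
    return sum(n_paths(parent) for parent in PARENTS[state])


def uniform_pb(parent, child):
    """Uniform backward policy: every parent of `child` is equally likely."""
    return 1.0 / len(PARENTS[child])


def path_count_pb(parent, child):
    """Backward policy proportional to path counts (assumed max-entropy choice)."""
    return n_paths(parent) / n_paths(child)


def trajectories(state):
    """Yield every path from s0 to `state` as a tuple of states."""
    if state == "s0":
        yield ("s0",)
        return
    for parent in PARENTS[state]:
        for prefix in trajectories(parent):
            yield prefix + (state,)


def trajectory_distribution(pb):
    """P(tau) = R(x)/Z times the product of backward probabilities along tau."""
    z = sum(REWARD.values())
    dist = {}
    for terminal, reward in REWARD.items():
        for tau in trajectories(terminal):
            prob = reward / z
            for parent, child in zip(tau, tau[1:]):
                prob *= pb(parent, child)
            dist[tau] = prob
    return dist


def entropy(dist):
    """Shannon entropy (in nats) of a distribution given as a dict."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def terminal_marginal(dist):
    """Probability of ending at each terminal state."""
    marginal = {}
    for tau, p in dist.items():
        marginal[tau[-1]] = marginal.get(tau[-1], 0.0) + p
    return marginal


if __name__ == "__main__":
    for name, pb in [("uniform", uniform_pb), ("path-count", path_count_pb)]:
        dist = trajectory_distribution(pb)
        print(name, terminal_marginal(dist), round(entropy(dist), 4))
```

Both backward policies sample terminal states in proportion to the reward, so they differ only in how probability is spread over the trajectories leading to each terminal state. In this DAG the state "b" can be reached by two paths and "a" by one, so the uniform policy over-weights the single path through "a" into "x", and its trajectory entropy comes out lower than that of the path-count policy.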