TraDiffusion: Trajectory-Based Training-Free Image Generation
This work addresses the need for more intuitive and precise user control in image generation, though it appears incremental because it builds on existing diffusion models.
The paper tackles controllable text-to-image (T2I) generation by introducing TraDiffusion, a training-free method that uses mouse trajectories to guide image generation. Experiments on the COCO dataset indicate that it offers simpler and more natural control.
In this work, we propose a training-free, trajectory-based controllable T2I approach, termed TraDiffusion. This novel method allows users to effortlessly guide image generation via mouse trajectories. To achieve precise control, we design a distance awareness energy function to effectively guide latent variables, ensuring that the focus of generation is within the areas defined by the trajectory. The energy function encompasses a control function to draw the generation closer to the specified trajectory and a movement function to diminish activity in areas distant from the trajectory. Through extensive experiments and qualitative assessments on the COCO dataset, the results reveal that TraDiffusion facilitates simpler, more natural image control. Moreover, it showcases the ability to manipulate salient regions, attributes, and relationships within the generated images, alongside visual input based on arbitrary or enhanced trajectories.
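The two-part energy described above can be illustrated with a small NumPy sketch. It is not the paper's exact formulation: the function names, the normalized distance map, the `far_threshold` cutoff, and the `weight` parameter are all assumptions made for illustration. The control term is taken to be the attention-weighted mean distance to the trajectory, and the movement term the share of attention mass lying far from it.

```python
import numpy as np

def distance_map(height, width, trajectory):
    """Distance from every pixel to its nearest trajectory point, scaled to [0, 1].

    `trajectory` is a sequence of (row, col) points (hypothetical input format).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    pixels = np.stack([ys, xs], axis=-1).astype(float)        # (H, W, 2)
    points = np.asarray(trajectory, dtype=float)              # (N, 2)
    diffs = pixels[:, :, None, :] - points[None, None, :, :]  # (H, W, N, 2)
    dists = np.linalg.norm(diffs, axis=-1).min(axis=-1)       # (H, W)
    return dists / max(dists.max(), 1e-8)

def trajectory_energy(attention, dist, far_threshold=0.5, weight=1.0):
    """Illustrative energy: lower when attention sits on the trajectory.

    control  : attention-weighted mean distance (pulls attention toward the trajectory)
    movement : attention mass beyond `far_threshold` (penalizes far-away activity)
    Both terms are assumed forms, not taken from the paper.
    """
    attn = attention / max(attention.sum(), 1e-8)  # normalize to a distribution
    control = float((attn * dist).sum())
    movement = float(attn[dist > far_threshold].sum())
    return control + weight * movement

# Toy example: an 8x8 attention map and a short diagonal trajectory.
dist = distance_map(8, 8, [(1, 1), (2, 2), (3, 3)])

near = np.zeros((8, 8))
near[2, 2] = 1.0   # all attention on a trajectory point
far = np.zeros((8, 8))
far[7, 0] = 1.0    # all attention in a distant corner

energy_near = trajectory_energy(near, dist)
energy_far = trajectory_energy(far, dist)
```

In a training-free guidance setup, an energy of this kind would typically be evaluated on cross-attention maps during denoising, and its gradient with respect to the latent would be used to nudge the latent at each step. That step needs an autodiff framework and is omitted here.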