CV AI LG

Dec 24, 2024

1.58-bit FLUX

Chenglin Yang, Celong Liu, Xueqing Deng, Dongwon Kim, Xing Mei, Xiaohui Shen, Liang-Chieh Chen

arXiv:2412.18653v1

23.535 citations

h-index: 66

Originality: Incremental advance

AI Analysis

The paper tackles the problem of quantizing the state-of-the-art text-to-image model FLUX.1-dev to 1.58-bit weights, achieving comparable performance for 1024 x 1024 image generation while reducing model storage by 7.7x and inference memory by 5.1x.

This enables more efficient deployment of high-quality text-to-image models. The contribution is incremental, however, because it applies existing quantization methods to a new model.

We present 1.58-bit FLUX, the first successful approach to quantizing the state-of-the-art text-to-image generation model, FLUX.1-dev, using 1.58-bit weights (i.e., values in {-1, 0, +1}) while maintaining comparable performance for generating 1024 x 1024 images. Notably, our quantization method operates without access to image data, relying solely on self-supervision from the FLUX.1-dev model. Additionally, we develop a custom kernel optimized for 1.58-bit operations, achieving a 7.7x reduction in model storage, a 5.1x reduction in inference memory, and improved inference latency. Extensive evaluations on the GenEval and T2I-CompBench benchmarks demonstrate the effectiveness of 1.58-bit FLUX in maintaining generation quality while significantly enhancing computational efficiency.
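To make the "1.58-bit" terminology concrete: a weight restricted to {-1, 0, +1} carries log2(3) ≈ 1.585 bits of information. The sketch below illustrates ternary quantization and a simple 2-bit packing scheme. It is not the authors' method or kernel, neither of which the abstract specifies. The absmean scaling rule is borrowed from BitNet b1.58 as an assumption, and the function names `ternarize`, `pack_2bit`, and `unpack_2bit` are hypothetical.

```python
import math

def ternarize(weights):
    """Map float weights to {-1, 0, +1} using absmean scaling (assumed rule)."""
    # Scale by the mean absolute value; fall back to 1.0 for an all-zero input.
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    # Round to the nearest integer, then clip into [-1, 1].
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def pack_2bit(q):
    """Pack ternary values four to a byte, storing each as (value + 1) in 2 bits."""
    out = bytearray()
    for i in range(0, len(q), 4):
        byte = 0
        for j, v in enumerate(q[i:i + 4]):
            byte |= (v + 1) << (2 * j)
        out.append(byte)
    return bytes(out)

def unpack_2bit(data, n):
    """Recover the first n ternary values from the packed bytes."""
    return [((data[i // 4] >> (2 * (i % 4))) & 0b11) - 1 for i in range(n)]

# Information content of one ternary weight, in bits.
bits_per_ternary = math.log2(3)

weights = [0.9, -0.05, -1.3, 0.4, 0.02, -0.7, 1.1, 0.0]
q, scale = ternarize(weights)
packed = pack_2bit(q)

print(f"bits per ternary weight: {bits_per_ternary:.3f}")
print(f"quantized: {q}")
print(f"packed into {len(packed)} bytes for {len(q)} weights")
```

Storing each ternary weight in 2 bits is a common simplification. Against a 16-bit baseline (an assumption here), it caps the per-layer saving at 8x, which is in the same range as the reported 7.7x storage reduction. Denser encodings closer to 1.585 bits per weight are possible but complicate unpacking.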
