Wei Wu

h-index6

3papers

136citations

Novelty53%

AI Score43

Ranked #54,308 of 194,257 authors (top 28%)#18,867 in CV (top 32%)

3 Papers

29.2CVMar 25, 2024Code

DreamLIP: Language-Image Pre-training with Long Captions

Kecheng Zheng, Yifei Zhang, Wei Wu et al.

Language-image pre-training largely relies on how precisely and thoroughly a text describes its paired image. In practice, however, the contents of an image can be so rich that well describing them requires lengthy captions (e.g., with 10 sentences), which are usually missing in existing datasets. Consequently, there are currently no clear evidences on whether and how language-image pre-training could benefit from long captions. To figure this out, we first re-caption 30M images with detailed descriptions using a pre-trained Multi-modality Large Language Model (MLLM), and then study the usage of the resulting captions under a contrastive learning framework. We observe that, each sentence within a long caption is very likely to describe the image partially (e.g., an object). Motivated by this, we propose to dynamically sample sub-captions from the text label to construct multiple positive pairs, and introduce a grouping loss to match the embeddings of each sub-caption with its corresponding local image patches in a self-supervised manner. Experimental results on a wide rage of downstream tasks demonstrate the consistent superiority of our method, termed DreamLIP, over previous alternatives, highlighting its fine-grained representational capacity. It is noteworthy that, on the tasks of image-text retrieval and semantic segmentation, our model trained with 30M image-text pairs achieves on par or even better performance than CLIP trained with 400M pairs. Project page is available at https://zyf0619sjtu.github.io/dream-lip.

12.8CVApr 18, 2024Code

FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models

Wei Wu, Qingnan Fan, Shuai Qin et al.

Precise image editing with text-to-image models has attracted increasing interest due to their remarkable generative capabilities and user-friendly nature. However, such attempts face the pivotal challenge of misalignment between the intended precise editing target regions and the broader area impacted by the guidance in practice. Despite excellent methods leveraging attention mechanisms that have been developed to refine the editing guidance, these approaches necessitate modifications through complex network architecture and are limited to specific editing tasks. In this work, we re-examine the diffusion process and misalignment problem from a frequency perspective, revealing that, due to the power law of natural images and the decaying noise schedule, the denoising network primarily recovers low-frequency image components during the earlier timesteps and thus brings excessive low-frequency signals for editing. Leveraging this insight, we introduce a novel fine-tuning free approach that employs progressive $\textbf{Fre}$qu$\textbf{e}$ncy truncation to refine the guidance of $\textbf{Diff}$usion models for universal editing tasks ($\textbf{FreeDiff}$). Our method achieves comparable results with state-of-the-art methods across a variety of editing tasks and on a diverse set of images, highlighting its potential as a versatile tool in image editing applications.

9.1CRDec 25, 2017

A Provably Secure Ring Signature Scheme in Certificateless Cryptography

Lei Zhang, Futai Zhang, Wei Wu

Ring signature is a kind of group-oriented signature. It allows a member of a group to sign messages on behalf of the group without revealing his/her identity. Certificateless public key cryptography was first introduced by Al-Riyami and Paterson in Asiacrypt 2003. In certificateless cryptography, it does not require the use of certificates to guarantee the authenticity of users' public keys. Meanwhile, certificateless cryptography does not have the key escrow problem, which seems to be inherent in the Identity-based cryptography. In this paper, we propose a concrete certificateless ring signature scheme. The security models of certificateless ring signature are also formalized. Our new scheme is provably secure in the random oracle model, with the assumption that the Computational Diffie-Hellman problem is hard. In addition, we also show that a generic construction of certificateless ring signature is insecure against the key replacement attack defined in our security models.