Yuan Shen

h-index6

3papers

33citations

Novelty38%

AI Score30

Ranked #137,822 of 194,257 authors (top 71%)#45,379 in CV (top 77%)

3 Papers

8.7CVMay 4, 2021Code

Effectively Leveraging Attributes for Visual Similarity

Samarth Mishra, Zhongping Zhang, Yuan Shen et al.

Measuring similarity between two images often requires performing complex reasoning along different axes (e.g., color, texture, or shape). Insights into what might be important for measuring similarity can can be provided by annotated attributes, but prior work tends to view these annotations as complete, resulting in them using a simplistic approach of predicting attributes on single images, which are, in turn, used to measure similarity. However, it is impractical for a dataset to fully annotate every attribute that may be important. Thus, only representing images based on these incomplete annotations may miss out on key information. To address this issue, we propose the Pairwise Attribute-informed similarity Network (PAN), which breaks similarity learning into capturing similarity conditions and relevance scores from a joint representation of two images. This enables our model to identify that two images contain the same attribute, but can have it deemed irrelevant (e.g., due to fine-grained differences between them) and ignored for measuring similarity between the two images. Notably, while prior methods of using attribute annotations are often unable to outperform prior art, PAN obtains a 4-9% improvement on compatibility prediction between clothing items on Polyvore Outfits, a 5% gain on few shot classification of images using Caltech-UCSD Birds (CUB), and over 1% boost to Recall@1 on In-Shop Clothes Retrieval. Implementation available at https://github.com/samarth4149/PAN

17.0LGJun 7, 2024Code

MMPolymer: A Multimodal Multitask Pretraining Framework for Polymer Property Prediction

Fanmeng Wang, Wentao Guo, Minjie Cheng et al.

Polymers are high-molecular-weight compounds constructed by the covalent bonding of numerous identical or similar monomers so that their 3D structures are complex yet exhibit unignorable regularity. Typically, the properties of a polymer, such as plasticity, conductivity, bio-compatibility, and so on, are highly correlated with its 3D structure. However, existing polymer property prediction methods heavily rely on the information learned from polymer SMILES sequences (P-SMILES strings) while ignoring crucial 3D structural information, resulting in sub-optimal performance. In this work, we propose MMPolymer, a novel multimodal multitask pretraining framework incorporating polymer 1D sequential and 3D structural information to encourage downstream polymer property prediction tasks. Besides, considering the scarcity of polymer 3D data, we further introduce the "Star Substitution" strategy to extract 3D structural information effectively. During pretraining, in addition to predicting masked tokens and recovering clear 3D coordinates, MMPolymer achieves the cross-modal alignment of latent representations. Then we further fine-tune the pretrained MMPolymer for downstream polymer property prediction tasks in the supervised learning paradigm. Experiments show that MMPolymer achieves state-of-the-art performance in downstream property prediction tasks. Moreover, given the pretrained MMPolymer, utilizing merely a single modality in the fine-tuning phase can also outperform existing methods, showcasing the exceptional capability of MMPolymer in polymer feature extraction and utilization.

1.6IRMar 18, 2020

Can AI decrypt fashion jargon for you?

Yuan Shen, Shanduojiao Jiang, Muhammad Rizky Wellyanto et al.

When people talk about fashion, they care about the underlying meaning of fashion concepts,e.g., style.For example, people ask questions like what features make this dress smart.However, the product descriptions in today fashion websites are full of domain specific and low level words. It is not clear to people how exactly those low level descriptions can contribute to a style or any high level fashion concept. In this paper, we proposed a data driven solution to address this concept understanding issues by leveraging a large number of existing product data on fashion sites. We first collected and categorized 1546 fashion keywords into 5 different fashion categories. Then, we collected a new fashion product dataset with 853,056 products in total. Finally, we trained a deep learning model that can explicitly predict and explain high level fashion concepts in a product image with its low level and domain specific fashion features.