CV, AI | Feb 22, 2022

RuCLIP -- new models and experiments: a technical report

Alex Shonenkov, Andrey Kuznetsov, Denis Dimitrov, Tatyana Shavrina, Daniil Chesakov, Anastasia Maltseva, Alena Fenogenova, Igor Pavlov, Anton Emelyanov, Sergey Markov, Daria Bakshandaeva, Vera Shybaeva

arXiv:2202.10784v12.64 citations

Originality: Synthesis-oriented

AI Analysis

This work provides improved models for Russian-language vision-language tasks, but it is incremental as it builds on existing CLIP architectures.

The authors introduce six new implementations of the ruCLIP model trained on 240M pairs; the best of them outperform the CLIP + OPUS-MT translation baseline on most of 16 datasets in few-shot and zero-shot tasks.

In this report we propose six new implementations of the ruCLIP model trained on our 240M pairs. Accuracy is compared with the original CLIP model combined with Ru-En translation (OPUS-MT) on 16 datasets from different domains. Our best implementations outperform the CLIP + OPUS-MT solution on most of the datasets in few-shot and zero-shot tasks. We briefly describe the implementations and concentrate on the conducted experiments. An inference execution time comparison is also presented.
