RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with (image, text) pairs.

Last update: Sep 20, 2022

Overview

RuCLIPtiny

Zero-shot image classification model for the Russian language

RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with (image, text) pairs. The model uses ConvNeXt-tiny as its image encoder base and DistilRuBert-tiny as its text encoder base, and it builds on extensive research in zero-shot transfer, computer vision, natural language processing, and multimodal learning.
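To make the zero-shot idea concrete, here is a minimal sketch of how a contrastive image-text model can classify an image without task-specific training: the image embedding is compared with the embedding of each candidate label text, and the scaled similarities are turned into probabilities with a softmax. The toy 3-dimensional embeddings, the helper names, and the `logit_scale` value are assumptions made for illustration; they are not taken from the RuCLIP tiny code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_probs(image_embedding, text_embeddings, logit_scale=100.0):
    """Convert image-to-text similarities into one probability per candidate text.

    logit_scale is an assumed temperature, not a value taken from RuCLIP tiny.
    """
    sims = [cosine_similarity(image_embedding, t) for t in text_embeddings]
    return softmax([logit_scale * s for s in sims])

# Toy embeddings invented for illustration; real ones come from the two encoders.
image = [0.9, 0.1, 0.0]
texts = {
    "диаграмма": [1.0, 0.0, 0.0],
    "собака": [0.0, 1.0, 0.0],
    "кошка": [0.0, 0.0, 1.0],
}
probs = zero_shot_probs(image, list(texts.values()))
best = max(zip(texts, probs), key=lambda pair: pair[1])[0]
print(best)
```

Because the label set is supplied only as text at prediction time, the same model can be pointed at a new set of classes simply by changing the list of candidate strings.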

Result evaluation

Our model achieved 46.62% top-1 and 73.18% top-5 zero-shot accuracy on CIFAR100.
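For readers unfamiliar with the metric, the sketch below shows how top-k accuracy is usually computed: a prediction counts as correct when the true class is among the k highest-scoring classes. The function name and the toy scores are illustrative assumptions and are unrelated to the CIFAR100 evaluation reported above.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and of equal length")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Toy scores for 4 samples and 3 classes, made up for illustration.
scores = [
    [0.7, 0.2, 0.1],  # true class 0: correct at k=1
    [0.1, 0.5, 0.4],  # true class 2: correct only at k=2
    [0.3, 0.3, 0.4],  # true class 2: correct at k=1
    [0.6, 0.3, 0.1],  # true class 2: wrong at k=1 and k=2
]
labels = [0, 2, 2, 2]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```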

Examples

Evaluate & Simple usage

Finetuning

ONNX conversion and speed testing

Model weights

Usage

Install the rucliptiny module and its requirements first. In a notebook, the following commands download the model weights and install the package:

!gdown -O ru-clip-tiny.pkl "https://drive.google.com/uc?id=1-3g3J90pZmHo9jbBzsEmr7ei5zm3VXOL"
!pip install git+https://github.com/cene555/ru-clip-tiny.git

Example in 3 steps

Download the CLIP image from the OpenAI CLIP repository

!wget -c -O CLIP.png "https://github.com/openai/CLIP/blob/main/CLIP.png?raw=true"

Import libraries

from rucliptiny.predictor import Predictor
from rucliptiny import RuCLIPtiny
import torch

torch.manual_seed(1)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Load model

model = RuCLIPtiny()
# map_location lets the checkpoint load on CPU-only machines as well
model.load_state_dict(torch.load('ru-clip-tiny.pkl', map_location=device))
model = model.to(device).eval()

Use predictor to get probabilities

predictor = Predictor()

classes = ['диаграмма', 'собака', 'кошка']
text_probs = predictor(model=model, images_path=["CLIP.png"],
                       classes=classes, get_probs=True,
                       max_len=77, device=device)
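The returned probabilities can then be mapped back to class names. The sketch below assumes that `text_probs` can be iterated as one row per image with one value per class; that layout is an assumption, so check the actual return value before relying on it. The helper `best_labels` is hypothetical, and the numbers are placeholders rather than real model output.

```python
def best_labels(probs, classes):
    """Return (label, probability) for the highest-probability class of each image."""
    results = []
    for row in probs:
        row = [float(p) for p in row]  # also accepts tensor or array elements
        index = max(range(len(row)), key=row.__getitem__)
        results.append((classes[index], row[index]))
    return results

classes = ['диаграмма', 'собака', 'кошка']
# Placeholder values standing in for text_probs; not real model output.
placeholder_probs = [[0.90, 0.06, 0.04]]
print(best_labels(placeholder_probs, classes))
```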

Cosine similarity Visualization Example

Speed Testing

NVIDIA Tesla K80 (Google Colab session)

Model (PyTorch)	Batch size	encode_image	encode_text	Total
RuCLIPtiny	2	0.011	0.004	0.015
RuCLIPtiny	8	0.011	0.004	0.015
RuCLIPtiny	16	0.012	0.005	0.017
RuCLIPtiny	32	0.014	0.005	0.019
RuCLIPtiny	64	0.013	0.006	0.019
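Timings like these can be collected with a small harness. The following is a minimal sketch, not the benchmark script used for the table: `mean_call_time` and `dummy_encode` are hypothetical names, and the stand-in workload should be replaced by the encoder call being measured.

```python
import time

def mean_call_time(fn, *args, warmup=3, repeats=20):
    """Average wall-clock seconds per call of fn(*args)."""
    for _ in range(warmup):  # warm-up calls are excluded from the measurement
        fn(*args)
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats

# Stand-in workload; replace with the encoder call you want to time.
def dummy_encode(batch):
    return [sum(row) for row in batch]

batch = [[float(i)] * 64 for i in range(16)]
elapsed = mean_call_time(dummy_encode, batch)
print(f"{elapsed:.6f} s per call")
```

When timing a PyTorch model on a GPU, call `torch.cuda.synchronize()` before reading the clock, because CUDA kernels run asynchronously and the Python call can return before the work has finished.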

We would like to express our gratitude to Sber AI for the grants that funded this research as part of the Artificial Intelligence International Junior Contest (AIIJC).

Owner

Shahmatov Arseniy
