
Transformers-ru

A list of pretrained Transformer models for the Russian language (including multilingual models).

Code for using the models and for visualization comes from the following repos:

Models

There are models from:

| Model description | # params | Config | Vocabulary | Model | BPE codes |
|---|---|---|---|---|---|
| BERT-Base, Multilingual Cased: 104 languages, 12-layer, 768-hidden, 12-heads, 110M parameters | 170M | [huggingface] 1K | [huggingface] 973K | [huggingface] 682M | |
| BERT-Base, Multilingual Uncased: 102 languages, 12-layer, 768-hidden, 12-heads, 110M parameters | 160M | [huggingface] 1K | [huggingface] 852K | [huggingface] 642M | |
| RuBERT, Russian, cased, 12-layer, 768-hidden, 12-heads, 180M parameters | 170M | | | [deeppavlov] 636M | |
| SlavicBERT, Slavic (bg, cs, pl, ru), cased, 12-layer, 768-hidden, 12-heads, 180M parameters | 170M | | | [deeppavlov] 636M | |
| XLM (MLM), 15 languages | 237M | [huggingface] 1K | [huggingface] 2,9M / [facebook] 1,5M | [huggingface] 1,3G / [facebook] 1,3G | [huggingface] 1,4M / [facebook] 1,4M |
| XLM (MLM+TLM), 15 languages | 237M | [huggingface] 1K | [huggingface] 2,9M / [facebook] 1,5M | [huggingface] 661M / [facebook] 665M | [huggingface] 1,4M / [facebook] 1,4M |
| XLM (MLM), 17 languages | | | [facebook] 2,9M | [facebook] 1,1G | [facebook] 2,9M |
| XLM (MLM), 100 languages | | | [facebook] 3,0M | [facebook] 1,1G | [facebook] 2,9M |
| Denis Antyukhov BERT-Base, Russian, Uncased, 12-layer, 768-hidden, 12-heads | 176M | | | [bert_resourses] 1,9G | |
| Facebook-FAIR's WMT'19 en-ru | | | | [fairseq] 12G | |
| Facebook-FAIR's WMT'19 ru-en | | | | [fairseq] 12G | |
| Facebook-FAIR's WMT'19 ru | | | | [fairseq] 2,1G | |
| Russian RuBERTa | | | | [Google Drive] 247M | |

Converting TensorFlow models to PyTorch

Downloading and converting the DeepPavlov model:

$ wget 'http://files.deeppavlov.ai/deeppavlov_data/bert/rubert_cased_L-12_H-768_A-12_v1.tar.gz'
$ tar -xzf rubert_cased_L-12_H-768_A-12_v1.tar.gz
$ python3 convert_tf_checkpoint_to_pytorch.py \
    --tf_checkpoint_path rubert_cased_L-12_H-768_A-12_v1/bert_model.ckpt \
    --bert_config_file rubert_cased_L-12_H-768_A-12_v1/bert_config.json \
    --pytorch_dump_path rubert_cased_L-12_H-768_A-12_v1/bert_model.bin
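After conversion, recent versions of the transformers library expect standard file names in the checkpoint directory (pytorch_model.bin, config.json, vocab.txt). A minimal renaming sketch, assuming the directory layout produced by the commands above; the target names follow the standard transformers checkpoint layout and may differ for very old library versions:

```python
# Sketch: map the converted DeepPavlov files to the names that
# transformers' from_pretrained() looks for. Target names assume the
# standard transformers checkpoint layout.
from pathlib import Path

def transformers_layout(src_dir):
    """Return {current_path: expected_path} for a converted checkpoint."""
    src = Path(src_dir)
    return {
        src / "bert_model.bin": src / "pytorch_model.bin",
        src / "bert_config.json": src / "config.json",
        # vocab.txt already has the expected name
    }

mapping = transformers_layout("rubert_cased_L-12_H-768_A-12_v1")
for old, new in mapping.items():
    print(f"mv {old} {new}")
```

Once the files are renamed, the directory can be passed directly to `BertModel.from_pretrained()`.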

Models comparison

There are scripts to train and evaluate models on the Sber SQuAD dataset for the Russian language [download dataset].

Comparison of BERT models trained on the Sber SQuAD dataset:

| Model | EM (dev) | F-1 (dev) |
|---|---|---|
| BERT-Base, Multilingual Cased | 64.85 | 83.68 |
| BERT-Base, Multilingual Uncased | 64.73 | 83.25 |
| RuBERT | 66.38 | 84.58 |
| SlavicBERT | 65.23 | 83.68 |
| RuBERTa-base | 59.45 | 78.60 |
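For reference, EM and F-1 are the standard SQuAD answer-span metrics: exact match after normalization, and token-level overlap between the predicted and gold answers. A simplified sketch for a single prediction (the official SQuAD script additionally strips English articles, which does not apply to Russian):

```python
# Simplified SQuAD-style metrics: exact match (EM) and token-overlap F1.
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop ASCII punctuation, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return re.sub(r"\s+", " ", text).strip()

def exact_match(pred, gold):
    return float(normalize(pred) == normalize(gold))

def f1(pred, gold):
    p_tokens = normalize(pred).split()
    g_tokens = normalize(gold).split()
    overlap = sum((Counter(p_tokens) & Counter(g_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p_tokens)
    recall = overlap / len(g_tokens)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level EM and F-1 are the averages of these per-question scores over the dev set.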

Visualization

The attention-head view visualization from BertViz: [Notebook]

The model view visualization from BertViz: [Notebook]

The neuron view visualization from BertViz: [Notebook]

Generative models

GPT-2 models

Mikhail Grankin's model

Code: https://github.com/mgrankin/ru_transformers

Download models:

pip install awscli
aws s3 sync --no-sign-request s3://models.dobro.ai/gpt2/ru/unfreeze_all gpt2

Vladimir Larin's model

RNN Models

There are also some RNN models for the Russian language.

ELMo

  • RNC and Wikipedia. December 2018 (tokens): [model]
  • RNC and Wikipedia. December 2018 (lemmas): [model]
  • Taiga 2048. December 2019 (lemmas): [model]

ULMFit

transformers-ru's People

Contributors

vlarine

transformers-ru's Issues

ELMo models

Add a section for non-transformer models. Add links to Russian ELMo models.

how to use?

Could you provide an inference example?

I tried something like this (after downloading the RoBERTa model):
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
from functools import wraps
import time

def job():
    image=Image.open("Screenshot_1.jpg")   
    #model_version = "microsoft/trocr-small-printed"

    model_version = "ruberta_base"   
   
    processor = TrOCRProcessor.from_pretrained(model_version)
    
    pixel_values = processor(image, return_tensors="bin").pixel_values

    model = VisionEncoderDecoderModel.from_pretrained(model_version)
    generated_ids = model.generate(pixel_values)
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(generated_text)
job()

but it failed with:

OSError: ruberta_base does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/ruberta_base/main' for available files.

use with transformers

from transformers import AutoTokenizer, AutoModelForCausalLM

path = './gpt2/m_checkpoint-3364613'

tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path)


OSError: Model name './gpt2/m_checkpoint-3364613' was not found in tokenizers model name list (gpt2, gpt2-medium, gpt2-large, gpt2-xl, distilgpt2). We assumed './gpt2/m_checkpoint-3364613' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url.
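As the error message says, `from_pretrained()` treats the argument as a local directory and looks for specific file names inside it. A small sketch for checking a checkpoint directory; the file list is inferred from the error message above and the usual transformers checkpoint layout, so treat it as an assumption rather than an exhaustive spec:

```python
# Sketch: report which files a local GPT-2 checkpoint directory is
# missing. REQUIRED is inferred from the OSError above (vocab.json,
# merges.txt) plus the standard transformers layout.
from pathlib import Path

REQUIRED = ["config.json", "pytorch_model.bin", "vocab.json", "merges.txt"]

def missing_files(checkpoint_dir):
    """List required checkpoint files absent from checkpoint_dir."""
    d = Path(checkpoint_dir)
    return [name for name in REQUIRED if not (d / name).is_file()]
```

If `missing_files()` reports vocab.json or merges.txt, the tokenizer files were not downloaded; re-running the `aws s3 sync` command above should fetch them.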
