
dl-translate's Introduction

DL Translate


A translation library for 200 languages built on Huggingface transformers

💻 GitHub Repository
📚 Documentation
🐍 PyPi project
🧪 Colab Demo / Kaggle Demo

Quickstart

Install the library with pip:

pip install dl-translate

To translate some text:

import dl_translate as dlt

mt = dlt.TranslationModel()  # Slow when you load it for the first time

text_hi = "संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है"
mt.translate(text_hi, source=dlt.lang.HINDI, target=dlt.lang.ENGLISH)

Above, you can see that dlt.lang contains variables representing each of the 200 available languages with auto-complete support. Alternatively, you can specify the language (e.g. "Arabic") or the language code (e.g. "fr" for French):

text_ar = "الأمين العام للأمم المتحدة يقول إنه لا يوجد حل عسكري في سوريا."
mt.translate(text_ar, source="Arabic", target="fr")

If you want to verify whether a language is available, you can check it:

print(mt.available_languages())  # All languages that you can use
print(mt.available_codes())  # Code corresponding to each language accepted
print(mt.get_lang_code_map())  # Dictionary of lang -> code

Usage

Selecting a device

When you load the model, you can specify the device:

mt = dlt.TranslationModel(device="auto")

By default, the value is device="auto", which means a GPU will be used if one is available. You can also explicitly set device="cpu" or device="gpu", or some other string accepted by torch.device(). In general, it is recommended to use a GPU if you want a reasonable processing time.
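
For instance, here is a minimal sketch of a few explicit choices (the "cuda:1" string is only illustrative and assumes a machine with more than one GPU):

import dl_translate as dlt

mt_cpu = dlt.TranslationModel(device="cpu")       # force the CPU
mt_gpu = dlt.TranslationModel(device="gpu")       # force the GPU
mt_cuda1 = dlt.TranslationModel(device="cuda:1")  # any other string accepted by torch.device()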

Choosing a different model

By default, the m2m100 model will be used. However, there are a few options:

  • mBART-50 Large: Allows translations across 50 languages.
  • m2m100: Allows translations across 100 languages.
  • nllb-200 (New in v0.3): Allows translations across 200 languages, and is faster than m2m100 (on an RTX A6000, we observe a speedup of roughly 3x).

Here's an example:

# The default model (m2m100)
mt = dlt.TranslationModel("m2m100")  # Shorthand
mt = dlt.TranslationModel("facebook/m2m100_418M")  # Huggingface repo

# If you want to use mBART-50 Large
mt = dlt.TranslationModel("mbart50")
mt = dlt.TranslationModel("facebook/mbart-large-50-many-to-many-mmt")

# Or NLLB-200 (faster and has 200 languages)
mt = dlt.TranslationModel("nllb200")
mt = dlt.TranslationModel("facebook/nllb-200-distilled-600M")

Note that the language code will change depending on the model family. To find out the correct language codes, please read the doc page on available languages or run mt.available_codes().

By default, dlt.TranslationModel will download the model from the huggingface repo for mbart50, m2m100, or nllb200 and cache it. It's possible to load the model from a path or a model with a similar format, but you will need to specify the model_family:

mt = dlt.TranslationModel("/path/to/model/directory/", model_family="mbart50")
mt = dlt.TranslationModel("facebook/m2m100_1.2B", model_family="m2m100")
mt = dlt.TranslationModel("facebook/nllb-200-distilled-600M", model_family="nllb200")

Notes:

  • Make sure your tokenizer is also stored in the same directory if you load from a file.
  • The available languages will change if you select a different model, so you will not be able to leverage dlt.lang or dlt.utils.
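
If you want to prepare such a directory yourself, here is a minimal sketch for the m2m100 family (the directory name is illustrative); it relies on transformers' save_pretrained to store the model and the tokenizer in the same place:

import dl_translate as dlt
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Download once, then save the model and tokenizer into a single directory
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model.save_pretrained("my_m2m100_dir/")
tokenizer.save_pretrained("my_m2m100_dir/")

# Later, load it back through dlt by pointing at the directory and naming the family
mt = dlt.TranslationModel("my_m2m100_dir/", model_family="m2m100")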

Splitting into sentences

It is not recommended to use extremely long texts, as they take longer to process. Instead, you can break them down into sentences with the help of nltk. First install the library with pip install nltk, then run:

import nltk

nltk.download("punkt")

text = "Mr. Smith went to his favorite cafe. There, he met his friend Dr. Doe."
sents = nltk.tokenize.sent_tokenize(text, "english")  # don't use dlt.lang.ENGLISH
" ".join(mt.translate(sents, source=dlt.lang.ENGLISH, target=dlt.lang.FRENCH))

Batch size during translation

It's possible to set a batch size (i.e. the number of elements processed at once) for mt.translate and whether you want to see the progress bar or not:

# ...
mt = dlt.TranslationModel()
mt.translate(text, source, target, batch_size=32, verbose=True)

If you set batch_size=None, it will compute the entire input at once rather than splitting it into "chunks". We recommend lowering batch_size if you do not have a lot of RAM or VRAM and run into a CUDA memory error. Set a higher value if you are using a high-end GPU and the VRAM is not fully utilized.
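
For example, here is a sketch of translating a list of sentences with a smaller batch size (the sentences and the value of batch_size are only illustrative):

sentences = [
    "The first sentence.",
    "The second sentence.",
    "The third sentence.",
]
translated = mt.translate(
    sentences,
    source=dlt.lang.ENGLISH,
    target=dlt.lang.FRENCH,
    batch_size=2,   # lower this if you run into memory errors
    verbose=True,   # show a progress bar
)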

dlt.utils module

An alternative to mt.available_languages() is the dlt.utils module. You can use it to find out which languages and codes are available:

print(dlt.utils.available_languages('mbart50'))  # All languages that you can use
print(dlt.utils.available_codes('m2m100'))  # Code corresponding to each language accepted
print(dlt.utils.get_lang_code_map('nllb200'))  # Dictionary of lang -> code

Offline usage

Unlike the Google Translate or Microsoft Translator APIs, this library can be used fully offline. However, you will first need to download the packages and models, then move them to your offline environment to be installed and loaded inside a venv.

First, run in your terminal:

mkdir dlt
cd dlt
mkdir libraries
pip download -d libraries/ dl-translate

Once all the required packages are downloaded, you will need to use huggingface_hub to download the model files. Install it with pip install huggingface-hub. Then, run inside Python:

import shutil
import huggingface_hub as hub

dirname = hub.snapshot_download("facebook/m2m100_418M")
shutil.copytree(dirname, "cached_model_m2m100")  # Copy to a permanent folder

Now, move everything in the dlt directory to your offline environment. Create a virtual environment and run the following in terminal:

pip install --no-index --find-links libraries/ dl-translate

Now, run inside Python:

import dl_translate as dlt

mt = dlt.TranslationModel("cached_model_m2m100", model_family="m2m100")

Advanced

If you have knowledge of PyTorch and Huggingface Transformers, you can access advanced aspects of the library for more customization:

  • Saving and loading: If you wish to accelerate the loading time of the translation model, you can use save_obj and reload it later with load_obj. This method is only recommended if you are familiar with huggingface and torch; please read the docs for more information.
  • Interacting with underlying model and tokenizer: When initializing the model, you can pass in arguments for the underlying model and tokenizer with model_options and tokenizer_options respectively. You can also access the underlying transformers model with mt.get_transformers_model().
  • Keyword arguments for the generate() method: When running mt.translate, you can also give generation_options, which is passed to the generate() method of the underlying transformer model.

For more information, please visit the advanced section of the user guide.
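
As a rough sketch only (the option values below are illustrative assumptions, not recommended defaults; check the user guide for exactly what each argument accepts):

import dl_translate as dlt

mt = dlt.TranslationModel(
    "m2m100",
    model_options=dict(low_cpu_mem_usage=True),    # forwarded to the underlying model when it is loaded (assumption)
    tokenizer_options=dict(model_max_length=512),  # forwarded to the underlying tokenizer (assumption)
)

translated = mt.translate(
    "Hello world",
    source=dlt.lang.ENGLISH,
    target=dlt.lang.FRENCH,
    generation_options=dict(num_beams=5, max_length=200),  # passed to the model's generate() method
)

model = mt.get_transformers_model()  # direct access to the underlying transformers model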

Acknowledgement

dl-translate is built on top of Huggingface's implementation of the following models created by Facebook AI Research.

  1. The multilingual BART fine-tuned on many-to-many translation across over 50 languages, which is documented here. The original paper was written by Tang et al. from Facebook AI Research; you can find it here and cite it using the following:

    @article{tang2020multilingual,
        title={Multilingual translation with extensible multilingual pretraining and finetuning},
        author={Tang, Yuqing and Tran, Chau and Li, Xian and Chen, Peng-Jen and Goyal, Naman and Chaudhary, Vishrav and Gu, Jiatao and Fan, Angela},
        journal={arXiv preprint arXiv:2008.00401},
        year={2020}
    }
    
  2. The transformer model published in Beyond English-Centric Multilingual Machine Translation by Fan et al., which supports over 100 languages. You can cite it as follows:

    @misc{fan2020englishcentric,
         title={Beyond English-Centric Multilingual Machine Translation}, 
         author={Angela Fan and Shruti Bhosale and Holger Schwenk and Zhiyi Ma and Ahmed El-Kishky and Siddharth Goyal and Mandeep Baines and Onur Celebi and Guillaume Wenzek and Vishrav Chaudhary and Naman Goyal and Tom Birch and Vitaliy Liptchinsky and Sergey Edunov and Edouard Grave and Michael Auli and Armand Joulin},
         year={2020},
         eprint={2010.11125},
         archivePrefix={arXiv},
         primaryClass={cs.CL}
     }
    
  3. The No Language Left Behind (NLLB) model, which extends neural machine translation to over 200 languages. You can cite it as follows:

    @misc{nllbteam2022language,
        title={No Language Left Behind: Scaling Human-Centered Machine Translation}, 
        author={NLLB Team and Marta R. Costa-jussà and James Cross and Onur Çelebi and Maha Elbayad and Kenneth Heafield and Kevin Heffernan and Elahe Kalbassi and Janice Lam and Daniel Licht and Jean Maillard and Anna Sun and Skyler Wang and Guillaume Wenzek and Al Youngblood and Bapi Akula and Loic Barrault and Gabriel Mejia Gonzalez and Prangthip Hansanti and John Hoffman and Semarley Jarrett and Kaushik Ram Sadagopan and Dirk Rowe and Shannon Spruit and Chau Tran and Pierre Andrews and Necip Fazil Ayan and Shruti Bhosale and Sergey Edunov and Angela Fan and Cynthia Gao and Vedanuj Goswami and Francisco Guzmán and Philipp Koehn and Alexandre Mourachko and Christophe Ropers and Safiyyah Saleem and Holger Schwenk and Jeff Wang},
        year={2022},
        eprint={2207.04672},
        archivePrefix={arXiv},
        primaryClass={cs.CL}
    }
    

dlt is a wrapper with useful utilities to save you time. For comparison, here is what the equivalent code looks like when using Huggingface's transformers directly:

from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

article_hi = "संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है"
article_ar = "الأمين العام للأمم المتحدة يقول إنه لا يوجد حل عسكري في سوريا."

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")

# translate Hindi to French
tokenizer.src_lang = "hi_IN"
encoded_hi = tokenizer(article_hi, return_tensors="pt")
generated_tokens = model.generate(**encoded_hi, forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"])
tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
# => "Le chef de l 'ONU affirme qu 'il n 'y a pas de solution militaire en Syria."

# translate Arabic to English
tokenizer.src_lang = "ar_AR"
encoded_ar = tokenizer(article_ar, return_tensors="pt")
generated_tokens = model.generate(**encoded_ar, forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"])
tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
# => "The Secretary-General of the United Nations says there is no military solution in Syria."

With dlt, you can run:

import dl_translate as dlt

article_hi = "संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है"
article_ar = "الأمين العام للأمم المتحدة يقول إنه لا يوجد حل عسكري في سوريا."

mt = dlt.TranslationModel()
translated_fr = mt.translate(article_hi, source=dlt.lang.HINDI, target=dlt.lang.FRENCH)
translated_en = mt.translate(article_ar, source=dlt.lang.ARABIC, target=dlt.lang.ENGLISH)

Notice you don't have to think about tokenizers, conditional generation, pretrained models, or regional codes; you can just tell the model what to translate!

If you are experienced with huggingface's ecosystem, then you should be familiar enough with the example above that you wouldn't need this library. However, if you've never heard of huggingface or mBART, then I hope using this library will give you enough motivation to learn more about them :)

dl-translate's People

Contributors

ddorian, xhluca, xhlulu


dl-translate's Issues

Add MarianNMT

See Marian: https://huggingface.co/transformers/model_doc/marian.html
See helsinki-nlp's models: https://huggingface.co/Helsinki-NLP

We'd need

  • Add option to load the marian architecture at initialization (e.g. dlt.TranslationModel("marian"))
  • Add an option to find all of the languages (and code) available for a certain variant trained using marian, e.g. dlt.utils.available_languages("opus-en-romance")
  • An option to leverage autocomplete such as dlt.lang.opus.en_romance.ENGLISH, but the options would be limited to only what's available with the variant (i.e. romance)
  • TBD

No load-to-RAM mode

Hi, it's me again, and sorry about my bad English.
I have a project that uses this software on Windows tablets with 4 GB of RAM.
The problem is that the RAM consumption of this software is quite high, about 2.3 GB.
Is there any way to have this software read the data from storage (SSD or HDD) instead of keeping it in RAM?

Thank you for reading, and have a nice day.

module 'torch' has no attribute 'device'

Hello @xhlulu,
Please find attached the part of the tutorial that I tried to execute and where I get the error.
NB: I used the PyTorch guide to install torch with the command appropriate for my system, which is: pip3 install torch torchvision torchaudio.
My torch version is 1.10.1 and my Python version is 3.8.5.

[screenshots of the code and the resulting error were attached to the original issue]
Thank you for your help.

Error when using torch (1.8.0+cu111)

Traceback (most recent call last):
  File "translate_test.py", line 66, in <module>
    translate_test()
  File "translate_test.py", line 30, in translate_test
    rest = mt.predict(texts, _from = 'en',batch_size = size)
  File "/mnt/eclipse-glority/receipt/deploy/branches/dev/ms_deploy/util/translate_util.py", line 29, in predict
    rest = self.mt.translate(texts, source=_from, target=_to, batch_size = batch_size)
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/dl_translate/_translation_model.py", line 197, in translate
    **encoded, **generation_options
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/transformers/generation_utils.py", line 927, in generate
    model_kwargs = self._prepare_encoder_decoder_kwargs_for_generation(input_ids, model_kwargs)
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/transformers/generation_utils.py", line 412, in _prepare_encoder_decoder_kwargs_for_generation
    model_kwargs["encoder_outputs"]: ModelOutput = encoder(input_ids, return_dict=True, **encoder_kwargs)
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/transformers/models/m2m_100/modeling_m2m_100.py", line 780, in forward
    output_attentions=output_attentions,
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/transformers/models/m2m_100/modeling_m2m_100.py", line 388, in forward
    hidden_states = self.activation_fn(self.fc1(hidden_states))
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 94, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/hyj/anaconda3/envs/tf25/lib/python3.7/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

torch          1.8.0+cu111
torchvision    0.9.0+cu111

It works fine with torch 1.7.1+cu101. How can I fix this?

How to make (slow) translation faster

Hi, I am testing this code on a list of 5 short sentences. The average translation time is 2 seconds per sentence, which is too slow for my requirements. Any hints on how to speed up the translation? Thanks.

import dl_translate as dlt
import time
from langdetect import detect  # needed for detect() below

french_sentence = 'oh mon dieu c mechant c pas possible jamais je reviendrai, a deconseiller. je vous recommende de visiter un autre produit apres vous pouvez voire la difference'
arabic_sentence = '  لقد جربت عدة نسخ من هذا المنتج لكن لم استطع ان اجد فبه ما ينتج ما هذا الهراء'
ar2 = 'المنتج الاصلى سريع الذوبان فى الماء ويذوب بشكل مثالى على عكس المكمل المغشوش ...منتج كويس انا حبيتو و بنصح فيه'
ar3= 'امشي سيدا لفه الثانيه يسار تعدد المطالبات المتعلقة بالأراضي وما ينتج عن ذلك من تناحر يولد باستمرار نزاعات متجددة. ... ويمكن دمج ما ينتج عن ذلك من معارف في إطار برنامج عمل نيروبي' 
nepali ='यो मृत्युदर विकासशील देशहरुमा धेरै छ'
sent_list =[french_sentence, arabic_sentence, ar2, ar3, nepali]
print(sent_list)
mt = dlt.TranslationModel()  # Slow when you load it for the first time
map_langdetect_to_translate = {'ar':'Arabic', 'en':'English', 'es':'Spanish', 'fr':'French', 'ne':'Nepali'}
start = time.time() 
for sent in sent_list:
	print('-------------------------------------')
	print('original sentence is : ',sent)
	print('detected lang ',detect(sent))
	mapped = map_langdetect_to_translate[detect(sent)]
	translated = mt.translate(sent, source=mapped, target="en")
	print('Translation is : ',translated)

end = time.time()	
tt = time.strftime("%H:%M:%S", time.gmtime(end-start))
time_message = 'Query execution time : {}'.format( tt )
print(time_message)
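
Not an official answer, but based on the usage documented above, two things that may help are loading the model on a GPU (device="auto") and translating the sentences in a single batched call instead of one at a time, grouping them by source language first since source applies to the whole call. A rough sketch reusing the variables above (the batch_size value is illustrative):

mt = dlt.TranslationModel(device="auto")  # uses a GPU if one is available

arabic_sents = [arabic_sentence, ar2, ar3]
translated_ar = mt.translate(arabic_sents, source="Arabic", target="English", batch_size=8)
translated_fr = mt.translate(french_sentence, source="French", target="English")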

Detect source language with langdetect package

The langdetect package has worked well for me in the past for language detection problems. How would you feel about allowing users to pass "auto" as an option for source? I can see some pros and cons:

Pros

  • Users don't need to be able to recognize a language to translate
  • Eliminates pre-classification of languages if your dataset contains multiple languages

Cons

I'm a little new to open source but I would love to contribute 🙂 Of course, if you feel this doesn't fit this package's mission that's totally understandable.

I/O mismatch in `translate` method

When calling model.translate, if the input is a string, the output will be a list of strings with length 1. We should check that the input and output types match in cases like this.
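
Until that is addressed, a caller-side workaround sketch based on the behavior described above:

result = mt.translate("Hello", source=dlt.lang.ENGLISH, target=dlt.lang.FRENCH)
if isinstance(result, list) and len(result) == 1:
    result = result[0]  # unwrap the single-element list when the input was a single string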

Add support for TPU

That would be nice for Kaggle/Colab/GCP users. Unfortunately I'm not too familiar with XLA so it might take a while before I take a stab at that.

Change assigned value for variables in dlt.lang

Right now, dlt.lang.ENGLISH = "en_XX", which is the fairseq language code. We should use one common notation (or keep that one) and map the "common" notation to each model-specific language code (fairseq, helsinki-nlp, etc.)

This will help for #2

Write tests

pytest might be a good framework for this. Then, we could add GitHub Actions for CI.

Support for sentence splitting

Right now TranslationModel.translate will translate each input string as is, which can be extremely slow for longer sequences due to the quadratic runtime of the architecture. The current recommended way is to use nltk:

import nltk

nltk.download("punkt")

text = "Mr. Smith went to his favorite cafe. There, he met his friend Dr. Doe."
sents = nltk.tokenize.sent_tokenize(text, "english")  # don't use dlt.lang.ENGLISH
" ".join(model.translate(sents, source=dlt.lang.ENGLISH, target=dlt.lang.FRENCH))

This works well but doesn't cover all possible languages. It would be interesting to train the punkt model on each of the languages made available (though we'd need a very large dataset for that). Once that's done, the snippet above could be replaced by a simple argument, e.g. model.translate(..., max_length="sentence"). With some more effort, the max_length parameter could also be an integer n between 0 and 512, representing the maximum number of tokens. Moreover, rather than truncating at that length, we could break the input text down into chunks of n tokens or fewer made of aggregated sentences; a rough sketch of this idea is shown below.
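
As a rough illustration of that last idea (not an existing dlt feature; tokenizer is assumed to be any Hugging Face tokenizer, e.g. the one backing the translation model):

import nltk

def chunk_by_token_count(text, tokenizer, n=512, language="english"):
    """Greedily group consecutive sentences into chunks of at most n tokens."""
    sents = nltk.tokenize.sent_tokenize(text, language)
    chunks, current = [], ""
    for sent in sents:
        candidate = (current + " " + sent).strip()
        if current and len(tokenizer.tokenize(candidate)) > n:
            chunks.append(current)
            current = sent
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks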

Offline mode tutorial

Hi, sorry for my bad English; I am quite a newbie.
I am confused by the offline tutorial, which says:
"Now, move everything in the dlt directory to your offline environment. Create a virtual environment:"
- Where is the "offline environment"?
- How do I create a "virtual environment"?
I am using Windows 11 and Python 3.9.

Offline usage can't load tokenizer

I followed the steps to perform offline usage.

At the last step, I run inside Python:

import dl_translate as dlt

mt = dlt.TranslationModel("cached_model_m2m100", model_family="m2m100")

I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/dl_translate/_translation_model.py", line 126, in __init__
    self._tokenizer = TokenizerFast.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 1788, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for 'cached_model_m2m100'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'cached_model_m2m100' is the correct path to a directory containing all relevant files for a M2M100Tokenizer tokenizer.

How can I fix this?
Thanks for your help.

Noun Keyword not translated properly

I am using version 0.2.6 of the dl-translate Python library and I am facing an issue when translating nouns.
Translation from ENGLISH to HINDI.
Scenario 1 :
Input :

text_hi="What is your name ?"
mt.translate(text_hi, source=dlt.lang.ENGLISH, target=dlt.lang.HINDI)
Output
'आपका नाम क्या है?'
Result : Correct

Scenario 2 :
Input :

text_hi="What is your name ?"
mt.translate(text_hi, source=dlt.lang.ENGLISH, target=dlt.lang.HINDI)
Output :
'विपिन कुमर ट्राइपाथी'
Result : Not Correct
Expected Result : 'विपिन कुमार त्रिपाठी'

Scenario 3 :
Input :

text_hi="Piyush Mishra"
mt.translate(text_hi, source=dlt.lang.ENGLISH, target=dlt.lang.HINDI)
Output
'मिश्रा मिश्रा'
Result : Not Correct
Expected Result : 'पीयूष मिश्रा'

Scenario 4 :
Input :

text_hi="Banana"
mt.translate(text_hi, source=dlt.lang.ENGLISH, target=dlt.lang.HINDI)
Output
'बादाम'
Result : Not Correct
Expected Result : 'केला'

Scenario 5 :
Input :

text_hi="Apple"
mt.translate(text_hi, source=dlt.lang.ENGLISH, target=dlt.lang.HINDI)
Output
'एप्पल'
Result : Not Correct
Expected Result : 'सेब'

@xhluca @bilunsun Please help us resolve this issue.

Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)

When I use dl_translate, the following warning appears. How do I set TOKENIZERS_PARALLELISM?

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
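
One way to handle this is to set the environment variable before the tokenizer is used, either in your shell or at the top of your script, for example:

import os

os.environ["TOKENIZERS_PARALLELISM"] = "false"  # or "true"; must be set before the tokenizer is used

import dl_translate as dlt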

Error on pyw extension

Hi, it's me again, sorry again for my bad English.
I tried this code in a .py file, opened it with Python IDLE, Run -> Run Module (F5), and there was no problem.
Then I renamed the extension to .pyw, opened it like an exe (double click), and this is the result:

Traceback (most recent call last):
  File "D:\Script\translate.pyw", line 67, in FB_Loading
    import dl_translate as dlt
  File "C:\Python\Python39\lib\site-packages\dl_translate\__init__.py", line 3, in <module>
    from ._translation_model import TranslationModel
  File "C:\Python\Python39\lib\site-packages\dl_translate\_translation_model.py", line 5, in <module>
    import transformers
  File "C:\Python\Python39\lib\site-packages\transformers\__init__.py", line 43, in <module>
    from . import dependency_versions_check
  File "C:\Python\Python39\lib\site-packages\transformers\dependency_versions_check.py", line 36, in <module>
    from .file_utils import is_tokenizers_available
  File "C:\Python\Python39\lib\site-packages\transformers\file_utils.py", line 58, in <module>
    logger = logging.get_logger(__name__)  # pylint: disable=invalid-name
  File "C:\Python\Python39\lib\site-packages\transformers\utils\logging.py", line 119, in get_logger
    _configure_library_root_logger()
  File "C:\Python\Python39\lib\site-packages\transformers\utils\logging.py", line 82, in _configure_library_root_logger
    _default_handler.flush = sys.stderr.flush
AttributeError: 'NoneType' object has no attribute 'flush'

Any guide on how to fix this?

Include an option to use batch_size

Right now the model will try to run generate on the entire input list at once, which could result in the device running out of memory. To avoid that, we could add a batch_size (or chunk_size) parameter to the translate function.

By default it would be None, which would attempt to run generate on everything at once. If an integer > 1 is given, the input will be split into batches of that size and the translations generated iteratively; a rough illustration is sketched below.
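
A rough illustration of the proposed behavior (this is not the library's actual internals; names are illustrative):

def iter_batches(items, batch_size=None):
    """Yield the whole list at once if batch_size is None, otherwise yield slices of that size."""
    if batch_size is None:
        yield items
        return
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]

# e.g. list(iter_batches(["a", "b", "c"], batch_size=2)) -> [["a", "b"], ["c"]]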
