Comments (2)
Those language variants probably have too little training data for the model to learn to recognize their language labels. I have not tested this myself, but we should test it more carefully to understand how each language label behaves. So far, this model does not use any over- or under-sampling. Maybe some additional fine-tuning for individual language pairs could do the trick?
from opus-mt-train.
By the way, one way to check whether a language label is supported is to grep for its token in the vocabulary file. This lists every vocabulary entry containing `>>`, so you can see whether the label is available at all:
grep '>>' *.vocab.yml
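If you would rather get a clean list of labels than raw grep output, the same check can be sketched in Python. This is only a rough sketch: it assumes the labels appear in the vocabulary file in the form `>>xxx<<` (for example `>>fra<<`), and the function names and the example file name are made up for illustration, not part of OPUS-MT.

```python
import re
from pathlib import Path

# Assumption: language labels look like >>xxx<< (e.g. >>fra<<, >>pt_br<<).
# Adjust the pattern if a model uses a different convention.
LABEL_PATTERN = re.compile(r">>([A-Za-z0-9_\-]+)<<")


def find_language_labels(vocab_text):
    """Return the sorted, de-duplicated label IDs found in vocabulary text."""
    return sorted(set(LABEL_PATTERN.findall(vocab_text)))


def labels_in_file(path):
    """Read a vocabulary file and return the language labels it contains."""
    return find_language_labels(Path(path).read_text(encoding="utf-8"))


def is_label_supported(label, vocab_text):
    """Check whether a label ID (without the >> << markers) is in the vocabulary."""
    return label in find_language_labels(vocab_text)
```

For example, `labels_in_file("my-model.vocab.yml")` (a hypothetical file name) would return the label IDs present in that file. Note that a label being in the vocabulary only shows that the token exists; it does not show that the model saw enough data to translate well into that variant.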