Comments (4)
Is there a way I can get the machine-translated prompts per task? For example, how would I get the Spanish (es) prompt for Paws-x only?
The prompts are here: https://github.com/Muennighoff/promptsource/blob/xp3mt/promptsource/templates/paws-x/es/templates.yaml
Is there a way I can get the input, output pairs for Paws-x only?
You can just download the paws-x files: https://huggingface.co/datasets/bigscience/xP3mt/tree/main/es (e.g. https://huggingface.co/datasets/bigscience/xP3mt/blob/main/es/xp3_paws-x_es_train_task_description-no-label_esmt.jsonl )
Also see the usage guidelines here that may help: https://huggingface.co/datasets/Muennighoff/xP3x#usage
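A minimal sketch of reading the input/output pairs from one of those files. It assumes each line is a JSON object with `inputs` and `targets` keys (inspect one line of the downloaded file to confirm) and that replacing `/blob/` with `/resolve/` in the Hugging Face URL yields the raw file. The helper names `to_raw_url`, `parse_jsonl` and `fetch_pairs` are hypothetical, not part of any library.

```python
import json
import urllib.request

BLOB_URL = (
    "https://huggingface.co/datasets/bigscience/xP3mt/blob/main/es/"
    "xp3_paws-x_es_train_task_description-no-label_esmt.jsonl"
)


def to_raw_url(blob_url: str) -> str:
    """Turn a Hugging Face 'blob' page URL into a direct-download URL."""
    return blob_url.replace("/blob/", "/resolve/", 1)


def parse_jsonl(text: str, input_key: str = "inputs", target_key: str = "targets"):
    """Return (input, target) pairs from JSON Lines text, skipping blank lines.

    The key names are an assumption about the file layout.
    """
    pairs = []
    for line in text.split("\n"):
        if line.strip():
            record = json.loads(line)
            pairs.append((record[input_key], record[target_key]))
    return pairs


def fetch_pairs(blob_url: str = BLOB_URL):
    """Download the file (needs network access) and parse it into pairs."""
    with urllib.request.urlopen(to_raw_url(blob_url)) as response:
        return parse_jsonl(response.read().decode("utf-8"))
```

Calling `fetch_pairs()` reads the whole training file into memory, which is fine for a quick look. For anything larger, the usage guidelines linked above are probably the better route.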
Also, more generally, how do you machine-translate prompts when the language is written right-to-left instead of left-to-right, or has a different word order such as subject-object-verb instead of subject-verb-object? Would the target come before the input, or would you reorder the sentences in the input (i.e., the premise or hypothesis) in the prompt? And if the target comes before the input, how would the model work, since it generates from left to right?
We use Google Machine Translate to translate the prompts and then put them in the same position for all languages. For right-to-left languages like Arabic, everything is the same (i.e., the text is processed from the beginning of the sentence to the end). Browsers usually handle displaying it as right-to-left, so we can treat it as left-to-right in the modelling phase.
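To illustrate the point about right-to-left text, here is a small sketch showing that a Python string holds Arabic characters in reading order, so index 0 is the first letter a reader would read, and the target can be appended after the prompt exactly as for any other language. The two Arabic words are illustrative only (written as escapes so the stored order is unambiguous on screen) and are not taken from the xP3mt prompts. `build_example` is a hypothetical helper.

```python
import unicodedata


def build_example(prompt: str, target: str) -> str:
    """Concatenate prompt then target, in the same order for every language."""
    return f"{prompt} {target}"


# Illustrative Arabic words, stored in logical (reading) order.
arabic_prompt = "\u0645\u0631\u062d\u0628\u0627"  # "marhaban" (hello)
arabic_target = "\u0646\u0639\u0645"              # "na'am" (yes)

example = build_example(arabic_prompt, arabic_target)

# The first stored character is the first letter read, even though a
# browser or terminal would draw it at the right-hand edge of the line.
first_char = example[0]
print(unicodedata.name(first_char))
print(unicodedata.bidirectional(first_char))
```

Display direction is decided at render time from each character's bidirectional class, so a model that consumes the string from index 0 onward never sees a reversed sequence.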
from xmtf.
Thank you for the quick response and the pointer. It is very helpful.
In the templates at https://github.com/Muennighoff/promptsource/blob/xp3mt/promptsource/templates/paws-x/es/templates.yaml , I saw that the metric was null, although the templates do have answer choices, and the original prompt at https://github.com/Muennighoff/promptsource/blob/xp3mt/promptsource/templates/paws/labeled_final/templates.yaml has Accuracy as a metric. Could I still use Accuracy as a metric for the machine-translated prompts?
Yes you can use accuracy. The metric field in that file is never used.
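For reference, a minimal sketch of exact-match accuracy over answer-choice predictions. The function name and the Spanish answer strings in the usage example are hypothetical. Use whatever answer choices the template you picked defines.

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference label.

    Surrounding whitespace is ignored; everything else must match exactly.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(
        pred.strip() == ref.strip()
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


# Hypothetical model outputs and gold labels for three examples.
preds = ["Sí", "No", "Sí "]
golds = ["Sí", "No", "No"]
score = accuracy(preds, golds)
```

With the sample lists above, two of the three predictions match, so `score` is 2/3.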
Great. Thank you for all your help!