Comments (2)
but meta-Llama-3-8B model correct
root@63733705cb83:/ColossalAI/applications/Colossal-LLaMA# python prepare_sft_dataset.py --data_input_dirs /models/train-data-dir --tokenizer_dir /models/llama-2-13b-hf^Ct_dirs /models/train-data-dir/out --num_spliced_dataset_bins 1
root@63733705cb83:/ColossalAI/applications/Colossal-LLaMA# python prepare_sft_dataset.py --data_input_dirs /models/train-data-dir --tokenizer_dir /models/meta-Llama-3-8B --data_output_dirs /models/train-data-dir/out --num_spliced_dataset_bins 1
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[05/14/24 16:51:56] INFO colossalai - colossalai - INFO: /ColossalAI/applications/Colossal-LLaMA/prepare_sft_dataset.py:101 main
INFO colossalai - colossalai - INFO: Start to process part-0/1 of all original datasets.
Map (num_proc=8): 100%|███████████████████████████████████████████████████████████████████████████████| 8/8 [00:02<00:00, 3.26 examples/s]
Filter: 100%|████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 469.98 examples/s]
[05/14/24 16:51:59] INFO colossalai - colossalai - INFO: /ColossalAI/applications/Colossal-LLaMA/prepare_sft_dataset.py:127 main
INFO colossalai - colossalai - INFO: processing 0 spliced data points for
/models/train-data-dir/out/jsonl/part-00000.jsonl
INFO colossalai - colossalai - INFO: /ColossalAI/applications/Colossal-LLaMA/prepare_sft_dataset.py:133 main
INFO colossalai - colossalai - INFO: Start to save /models/train-data-dir/out/arrow/part-00000
Setting num_proc from 48 back to 1 for the train split to disable multiprocessing as it only contains one shard.
Generating train split: 8 examples [00:00, 967.99 examples/s]
Saving the dataset (8/8 shards): 100%|████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 36.59 examples/s]
from colossalai.
Hi,
As the bos
, eos
tokens are different for llama2 and llama3, so you need to choose which version of llama you want to process.
We have a parameter to set version of llama here:
The default one is llama3.
Also you will need to change the default conversation template to llama2 at here:
We will fix the issue. Thanks.
from colossalai.
Related Issues (20)
- [BUG]: ValueError: mutable default <class 'colossalai.legacy.tensor.distspec._DistSpec'> for field dist_attr is not allowed: use default_factory HOT 1
- [BUG]: AttributeError: type object 'ColoParameter' has no attribute 'from_torch_tensor' when run hybrid_parallel example HOT 3
- [FEATURE]: Support qwen2 model
- [BUG]: OOM when saving 70B model HOT 2
- [DOC]: What is the datasetset used to train the Colossal-Llama-2? HOT 1
- [BUG]: Running ColossalAI in H800 with torch 2.0 HOT 29
- [BUG]: pretraing llama2 using "gemini" plugin, can not resume from saved checkpoints HOT 1
- [BUG] [Shardformer]: Error in blip2 testing with half precision HOT 1
- [FEATURE]: support multiple (partial) backward passes for zero
- [BUG]: re-join str type error_msgs using `\n\t` in general_checkpoint_io
- how to wrapped multiple models with booster HOT 3
- [BUG]: ColossalMoE Train: AssertionError: Parameters are expected to have the same dtype `torch.bfloat16`, but got `torch.float32` HOT 1
- [PROPOSAL]: Fix potential github action smells
- Does colossalai support rocm? HOT 2
- [BUG]: Slack link is invalid HOT 1
- [BUG]: GROK-1 does not support do_sample
- [BUG]: TypeError: _gen_python_code() got an unexpected keyword argument 'verbose' HOT 2
- [BUG]: llama2 hybrid_parallel or 3d giving None loss when using pp_size > 1 HOT 6
- [DOC]: torch-version HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from colossalai.