pacman100 / accelerate-megatron-test

Testing the accelerate megatron integration

License: Apache License 2.0

Python 98.12% Shell 1.88%

accelerate-megatron-test's Introduction

hey there

About Me :

I'm Sourab Mangrulkar, an Applied Scientist and Machine Learning Engineer from India 🇮🇳.

🔭 I’m currently working as an Applied Scientist at Amazon.
🌱 Exploring Natural Language Processing, Computer Vision and Distributed Training at Scale. Always up for meaningful collaboration.
😄 Pronouns: He/His/Him.
⚡ Painting 🎨, sketching ✍️ and poetry 📝 are my favourite hobbies. Recently, I've started reading up on stocks and economic markets.

accelerate-megatron-test's Issues

Running the provided script returns gpu_ids error

~/accelerate-megatron-test$ ./megatron_lm_gpt_pretraining.sh 
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/pytorch_p37/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/anaconda3/envs/pytorch_p37/lib/python3.7/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/home/ubuntu/anaconda3/envs/pytorch_p37/lib/python3.7/site-packages/accelerate/commands/launch.py", line 747, in launch_command
    defaults = load_config_from_file(args.config_file)
  File "/home/ubuntu/anaconda3/envs/pytorch_p37/lib/python3.7/site-packages/accelerate/commands/config/config_args.py", line 64, in load_config_from_file
    return config_class.from_yaml_file(yaml_file=config_file)
  File "/home/ubuntu/anaconda3/envs/pytorch_p37/lib/python3.7/site-packages/accelerate/commands/config/config_args.py", line 117, in from_yaml_file
    return cls(**config_dict)
TypeError: __init__() got an unexpected keyword argument 'gpu_ids'

I removed the config and ran the script on my EC2 machine with only the CPU enabled.

It still failed with:

./megatron_lm_gpt_pretraining.sh 
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/pytorch_p37/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/anaconda3/envs/pytorch_p37/lib/python3.7/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/home/ubuntu/anaconda3/envs/pytorch_p37/lib/python3.7/site-packages/accelerate/commands/launch.py", line 747, in launch_command
    defaults = load_config_from_file(args.config_file)
  File "/home/ubuntu/anaconda3/envs/pytorch_p37/lib/python3.7/site-packages/accelerate/commands/config/config_args.py", line 64, in load_config_from_file
    return config_class.from_yaml_file(yaml_file=config_file)
  File "/home/ubuntu/anaconda3/envs/pytorch_p37/lib/python3.7/site-packages/accelerate/commands/config/config_args.py", line 117, in from_yaml_file
    return cls(**config_dict)
TypeError: __init__() got an unexpected keyword argument 'megatron_lm_config'
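
Both tracebacks end in `cls(**config_dict)`, so any key in the YAML config that the installed config class does not declare raises this `TypeError`. One plausible cause, stated here as an assumption, is that the installed accelerate version is older than the one the config file was written for. The sketch below reproduces the mechanism with a hypothetical `LaunchConfig` dataclass (not accelerate's real class) and a hypothetical `unknown_keys` helper that lists the offending keys; the config values are made up for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class LaunchConfig:
    # Hypothetical stand-in for an older config class that does not
    # declare `gpu_ids` or `megatron_lm_config`.
    compute_environment: str = "LOCAL_MACHINE"
    num_processes: int = 1
    use_cpu: bool = False


def unknown_keys(config_cls, config_dict):
    """Return the keys in config_dict that config_cls does not declare."""
    known = {f.name for f in fields(config_cls)}
    return sorted(k for k in config_dict if k not in known)


# Illustrative config contents (assumed, not taken from the repository).
config_dict = {
    "compute_environment": "LOCAL_MACHINE",
    "num_processes": 8,
    "gpu_ids": "all",
    "megatron_lm_config": {},
}

extra = unknown_keys(LaunchConfig, config_dict)
print("Keys the class does not accept:", extra)

# Passing the undeclared keys reproduces the same kind of failure.
raised = False
try:
    LaunchConfig(**config_dict)
except TypeError as err:
    raised = True
    print("TypeError:", err)
```

If a check like this reports keys such as `gpu_ids`, comparing the installed accelerate version against the one the repository expects is a reasonable next step.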

Generation issue for run_clm_no_trainer_with_mmap_dataset.py

I trained and saved a model using run_clm_no_trainer_with_mmap_dataset.py.
Following megatron_gpt2_generation.py, I added the following code at the top of run_clm_no_trainer_with_mmap_dataset.py:

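import os

from accelerate import Accelerator
from accelerate.utils import MegatronLMPlugin

# Point Megatron-LM at the tokenizer files stored next to the checkpoint.
vocab_file = os.path.join(args.resume_from_checkpoint, "vocab.json")
merge_file = os.path.join(args.resume_from_checkpoint, "merges.txt")
other_megatron_args = {"vocab_file": vocab_file, "merge_file": merge_file}
megatron_lm_plugin = MegatronLMPlugin(other_megatron_args=other_megatron_args)

# `args` and `accelerator_log_kwargs` are defined earlier in the training script.
accelerator = Accelerator(
    gradient_accumulation_steps=args.gradient_accumulation_steps,
    megatron_lm_plugin=megatron_lm_plugin,
    **accelerator_log_kwargs,
)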

.
.
.

tokenizer.pad_token = tokenizer.eos_token
max_new_tokens = 64
batch_texts = [
    "Are you human?",
    ...
    ...

but I get the following error:

generated_tokens = model.megatron_generate(
  File "/usr/local/src/accelerate/src/accelerate/utils/megatron_lm.py", line 1275, in megatron_generate
    raise ValueError("Vocab file is required for inference")
ValueError: Vocab file is required for inference

Even when I use tokenizer.save_pretrained() to save vocab.json into the resume_from_checkpoint directory, accelerator.load_state() seems to reset the vocab file to None.

I also tried running megatron_gpt2_generation.py directly, but it fails with:

AssertionError: tokenizer_type value from checkpoint (None) is not equal to the input argument value (GPT2BPETokenizer).

I think this happens because the indexed dataset is already tokenized, so accelerator.save_state() does not save the tokenizer information, whereas megatron_generate() needs the vocab.json information.
Is this a bug, or what is the right way to run generation?
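
One thing that can be ruled out before calling generation is that the tokenizer files are simply absent from the checkpoint directory. The sketch below is a minimal check under that assumption: `tokenizer_args_from_checkpoint` is a hypothetical helper (not part of accelerate) that builds the same `vocab_file` and `merge_file` paths used above and fails early if either file is missing. It does not address whether `accelerator.load_state()` later resets these values.

```python
import os
import tempfile


def tokenizer_args_from_checkpoint(checkpoint_dir):
    """Hypothetical helper: return vocab/merge paths, failing early if absent."""
    paths = {
        "vocab_file": os.path.join(checkpoint_dir, "vocab.json"),
        "merge_file": os.path.join(checkpoint_dir, "merges.txt"),
    }
    missing = [p for p in paths.values() if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError(f"Missing tokenizer files: {missing}")
    return paths


# Demonstration with a temporary directory standing in for the checkpoint.
with tempfile.TemporaryDirectory() as ckpt:
    for name in ("vocab.json", "merges.txt"):
        with open(os.path.join(ckpt, name), "w") as fh:
            fh.write("")
    args_for_plugin = tokenizer_args_from_checkpoint(ckpt)
    print(sorted(args_for_plugin))
```

The returned dictionary has the same shape as `other_megatron_args` in the snippet above, so it could be passed to `MegatronLMPlugin` in the same way.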
