
Comments (16)

Masood-Salik commented on August 26, 2024 14

Does anyone know what the problem is with this?
torchrun --nproc_per_node 1 example.py --ckpt_dir ./weights/7B --tokenizer_path ./weights/tokenizer.model
The only error I get, with no details, is: failed to create process.

from llama.

neuhaus commented on August 26, 2024 12

OK, I also cannot get it to run with "torchrun"; I get "failed to create process".

Edit:
It does work on Linux. The workaround for Windows with Python 3.9.x is to run

python -m torch.distributed.run instead of torchrun.
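
A minimal sketch of that workaround, for anyone who wants to script it. It only builds the argument list for the current interpreter plus `-m torch.distributed.run`, so it does not rely on the `torchrun` launcher. The helper name `build_launch_command` is hypothetical, and the paths are the ones from the original post.

```python
import sys


def build_launch_command(script, script_args, nproc_per_node=1):
    """Build an argv list that runs `script` via `python -m torch.distributed.run`.

    Hypothetical helper: it uses the interpreter that is running this code,
    so the `torchrun` launcher executable is never invoked.
    """
    return [
        sys.executable, "-m", "torch.distributed.run",
        "--nproc_per_node", str(nproc_per_node),
        script, *script_args,
    ]


if __name__ == "__main__":
    cmd = build_launch_command(
        "example.py",
        ["--ckpt_dir", "./weights/7B",
         "--tokenizer_path", "./weights/tokenizer.model"],
    )
    # Print the command; to actually launch it, pass `cmd` to
    # subprocess.run(cmd, check=True) from the repository root.
    print(" ".join(cmd))
```

Building the list first makes it easy to see exactly what will run before launching anything.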


lurenss commented on August 26, 2024 1

I found the error. To fix it, you have to point to the model and the tokenizer, e.g.:
torchrun --nproc_per_node 1 example.py --ckpt_dir ./model/7B --tokenizer_path ./model/tokenizer.model


felipehime commented on August 26, 2024

I got the same error here. Loading checkpoint for MP=0 but world size is 1.

The checkpoints variable was also empty when I checked: []

Dunno what's happening. By the way, is MP the number of GPUs in a single node?
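
A rough sketch of how to check this before launching. It assumes the script finds checkpoint shards by listing the `.pth` files directly inside `--ckpt_dir` and compares their count to the world size. That would fit the empty `[]` reported above, but nobody in this thread confirms it. On that reading, MP would be the number of checkpoint shards rather than the GPU count as such, and MP=0 would mean no `.pth` file was found at the given path. The function names are hypothetical.

```python
from pathlib import Path


def find_checkpoint_shards(ckpt_dir):
    """Return the sorted .pth files directly inside ckpt_dir (assumed lookup)."""
    return sorted(Path(ckpt_dir).glob("*.pth"))


def check_world_size(ckpt_dir, world_size):
    """Compare the number of shards found with the intended world size.

    Returns (ok, message). A count of 0 usually means ckpt_dir points
    at the wrong directory.
    """
    shards = find_checkpoint_shards(ckpt_dir)
    if len(shards) != world_size:
        return False, (
            f"found {len(shards)} .pth file(s) in {str(ckpt_dir)!r} "
            f"but world size is {world_size}"
        )
    return True, f"found {len(shards)} .pth file(s), matching world size {world_size}"


if __name__ == "__main__":
    # world_size corresponds to --nproc_per_node in the command above.
    ok, message = check_world_size("./weights/7B", world_size=1)
    print("OK:" if ok else "Problem:", message)
```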


felipehime commented on August 26, 2024

Is there a file named tokenizer.model? I only got a params.json.


lurenss commented on August 26, 2024

The folder where you downloaded the model contains the model directory, e.g. 7B, and also tokenizer.model.


felipehime commented on August 26, 2024

Well... this is odd. I got checklist.chk, consolidate.pth and params.json there, but no tokenizer.model ;/


felipehime commented on August 26, 2024

Found it! But the problem still persists.


felipehime commented on August 26, 2024

Ok, problem solved! It was a path problem lol


jeonbik commented on August 26, 2024

Here is how I got things working:
As per (#41 (comment)), edit download.sh.
Run ./download.sh.
Once you have checkpoints for any model, e.g. 7B, run:

torchrun --nproc_per_node 1 example.py --ckpt_dir ./model/7B --tokenizer_path ./model/tokenizer.model


emmiehine commented on August 26, 2024

Ok, problem solved! It was a path problem lol

@felipehime what was the path issue? I'm getting the same error even pointing the command explicitly to the directories.


kiritoyu commented on August 26, 2024

I downloaded the model, but it came without the 7B file. Why?


felipehime commented on August 26, 2024

Ok, problem solved! It was a path problem lol

@felipehime what was the path issue? I'm getting the same error even pointing the command explicitly to the directories.

Specifically, the path to tokenizer.model.
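
Since relative paths such as `./weights/tokenizer.model` are resolved against the directory the command is run from, a small check like the one below can show where each argument actually points. This is only a sketch: the helper name `find_path_problems` is hypothetical, and it assumes `--ckpt_dir` should be a directory and `--tokenizer_path` a file.

```python
from pathlib import Path


def find_path_problems(ckpt_dir, tokenizer_path, base=None):
    """Resolve both arguments against `base` (default: current directory)
    and return a list of human-readable problems (empty if both exist)."""
    base = Path(base) if base is not None else Path.cwd()
    ckpt = (base / ckpt_dir).resolve()
    tok = (base / tokenizer_path).resolve()

    problems = []
    if not ckpt.is_dir():
        problems.append(f"checkpoint directory not found: {ckpt}")
    if not tok.is_file():
        problems.append(f"tokenizer file not found: {tok}")
    return problems


if __name__ == "__main__":
    # Same arguments as in the command from the original post.
    issues = find_path_problems("./weights/7B", "./weights/tokenizer.model")
    if issues:
        for issue in issues:
            print("Problem:", issue)
    else:
        print("Both paths exist.")
```

Printing the fully resolved paths makes a wrong working directory or a misplaced tokenizer.model easy to spot.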


ka4on commented on August 26, 2024

Does anyone know whats the problem with this?? torchrun --nproc_per_node 1 example.py --ckpt_dir ./weights/7B --tokenizer_path ./weights/tokenizer.model I got this error only with no details: failed to create process.

I also have the same problem. Any solutions? Thank you!


albertodepaola commented on August 26, 2024

Closing, as the original author solved the issue. Feel free to open new issues with specific details on what you are facing for additional guidance. For future reference, check both the llama and llama-recipes repos for getting-started guides.


 avatar commented on August 26, 2024

Does anyone know whats the problem with this?? torchrun --nproc_per_node 1 example.py --ckpt_dir ./weights/7B --tokenizer_path ./weights/tokenizer.model I got this error only with no details: failed to create process.

i also have the same problem, any solutions? Thank you!

Have you found a solution yet?

