
Comments (6)

voidful commented on September 1, 2024

I tested on Colab and everything worked fine. It looks like you're using bf16. May I know what model you're using?

from textrl.

wac81 commented on September 1, 2024

I use this model:
checkpoint = "bigscience/bloom-560m"


wac81 commented on September 1, 2024

And if I use gpt2, I get a new error like this:
actions = torch.tensor([b["action"] for b in dataset], device=device)
Traceback (most recent call last):
File "/home/wac/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/wac/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/wac/.vscode-server/extensions/ms-python.python-2022.8.1/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
cli.main()
File "/home/wac/.vscode-server/extensions/ms-python.python-2022.8.1/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
run()
File "/home/wac/.vscode-server/extensions/ms-python.python-2022.8.1/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
File "/home/wac/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/wac/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/wac/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/data/TextRL/train_bloom.py", line 47, in <module>
agent.observe(obs, reward, done, reset)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/agent.py", line 164, in observe
self.batch_observe([obs], [reward], [done], [reset])
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/agents/ppo.py", line 684, in batch_observe
self._batch_observe_train(batch_obs, batch_reward, batch_done, batch_reset)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/agents/ppo.py", line 810, in _batch_observe_train
self._update_if_dataset_is_ready()
File "/data/TextRL/textrl/actor.py", line 194, in _update_if_dataset_is_ready
self._update(dataset)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/agents/ppo.py", line 490, in _update
distribs, vs_pred = self.model(states)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/nn/branched.py", line 30, in forward
return tuple(mod(*args, **kwargs) for mod in self.child_modules)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/nn/branched.py", line 30, in <genexpr>
return tuple(mod(*args, **kwargs) for mod in self.child_modules)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/container.py", line 204, in forward
input = module(input)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/data/TextRL/textrl/actor.py", line 153, in forward
return torch.distributions.Categorical(logits=logits)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/distributions/categorical.py", line 66, in __init__
super(Categorical, self).__init__(batch_shape, validate_args=validate_args)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/distributions/distribution.py", line 56, in __init__
raise ValueError(
ValueError: Expected parameter logits (Tensor of shape (3, 2, 50257)) of distribution Categorical(logits: torch.Size([3, 2, 50257])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]],

        [[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]],

        [[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]]], device='cuda:0',
       grad_fn=<SubBackward0>)
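
For anyone hitting the same ValueError: the traceback shows `torch.distributions.Categorical` rejecting logits that are all NaN. A minimal sketch of a guard that fails earlier with a shorter message is below. The helper name `check_logits` and the small tensor shapes are assumptions for illustration only; nothing here is part of TextRL or pfrl.

```python
import torch


def check_logits(logits: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: raise if any logit is NaN or infinite."""
    bad = ~torch.isfinite(logits)
    if bad.any():
        raise ValueError(
            f"{int(bad.sum())} of {logits.numel()} logits are non-finite"
        )
    return logits


# Finite logits pass through and can be wrapped in a Categorical.
good = torch.zeros(3, 2, 5)
dist = torch.distributions.Categorical(logits=check_logits(good))

# All-NaN logits (same pattern as in the traceback, smaller vocabulary).
bad = torch.full((3, 2, 5), float("nan"))
try:
    check_logits(bad)
    message = ""
except ValueError as err:
    message = str(err)
print(message)
```

Calling such a check right after the model's forward pass would show whether the NaN values already exist in the model output or only appear after a later update step.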


wac81 commented on September 1, 2024

I followed your Elon Musk example.


voidful commented on September 1, 2024

Be careful with the learning rate when fine-tuning via RL; setting a lower learning rate should help.
Here are the Colab examples; both models work:

Colab example: bigscience/bloom-560m

Colab example: huggingtweets/elonmusk
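
As a toy illustration of the learning-rate point (not TextRL code, and not the actual PPO update), the sketch below runs plain gradient descent on f(x) = x**2. The function name `descend` and the two learning rates are assumptions chosen only to show the effect: a step size that is too large makes the value grow until it overflows and turns into NaN, while a smaller one converges.

```python
import math


def descend(lr: float, steps: int = 2000, x: float = 1.0) -> float:
    """Run plain gradient descent on f(x) = x**2 and return the final x."""
    for _ in range(steps):
        grad = 2.0 * x      # derivative of x**2
        x = x - lr * grad   # one update step
    return x


diverged = descend(lr=1.5)   # each step multiplies x by -2, so it overflows
converged = descend(lr=0.1)  # each step multiplies x by 0.8, so it shrinks
print(math.isnan(diverged), abs(converged) < 1e-6)
```

With these toy values the first run overflows to infinity and then becomes NaN (inf minus inf), and the second shrinks toward 0. A real language model is far more complex, but the same mechanism is one common way NaN logits appear during RL fine-tuning.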


wac81 commented on September 1, 2024

Thank you, I found my error: it came from how I loaded the model with the CausalLM loader.

