Comments (2)

maxreciprocate avatar maxreciprocate commented on September 13, 2024

Hello @nhanph!

No, the removal is not useless. If you check the contents of pytorch_model.bin immediately after this line:

self.accelerator.save_state(dst_dir, **kwargs)
you will see that it contains the whole model state dictionary, which is not needed. After deleting the file and recreating it with model.save_pretrained and heads_only=True, only the value heads are kept there.

>>> list(before_deletion_state_dict.keys())[:32]
['v_head.0.weight', 'v_head.0.bias', 'v_head.2.weight', 'v_head.2.bias', 'base_model.model.transformer.wte.weight', 'base_model.model.transformer.wpe.weight', 'base_model.model.transformer.h.0.ln_1.weight', 'base_model.model.transformer.h.0.ln_1.bias', 'base_model.model.transformer.h.0.attn.c_attn.weight', 'base_model.model.transformer.h.0.attn.c_attn.bias', 'base_model.model.transformer.h.0.attn.c_attn.lora_A.default.weight', 'base_model.model.transformer.h.0.attn.c_attn.lora_B.default.weight', 'base_model.model.transformer.h.0.attn.c_proj.weight', 'base_model.model.transformer.h.0.attn.c_proj.bias', 'base_model.model.transformer.h.0.ln_2.weight', 'base_model.model.transformer.h.0.ln_2.bias', 'base_model.model.transformer.h.0.mlp.c_fc.weight', 'base_model.model.transformer.h.0.mlp.c_fc.bias', 'base_model.model.transformer.h.0.mlp.c_proj.weight', 'base_model.model.transformer.h.0.mlp.c_proj.bias', 'base_model.model.transformer.h.1.ln_1.weight', 'base_model.model.transformer.h.1.ln_1.bias', 'base_model.model.transformer.h.1.attn.c_attn.weight', 'base_model.model.transformer.h.1.attn.c_attn.bias', 'base_model.model.transformer.h.1.attn.c_attn.lora_A.default.weight', 'base_model.model.transformer.h.1.attn.c_attn.lora_B.default.weight', 'base_model.model.transformer.h.1.attn.c_proj.weight', 'base_model.model.transformer.h.1.attn.c_proj.bias', 'base_model.model.transformer.h.1.ln_2.weight', 'base_model.model.transformer.h.1.ln_2.bias', 'base_model.model.transformer.h.1.mlp.c_fc.weight', 'base_model.model.transformer.h.1.mlp.c_fc.bias']
>>> list(after_save_pretrained_state_dict.keys())[:32]
['v_head.0.weight', 'v_head.0.bias', 'v_head.2.weight', 'v_head.2.bias']
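The effect of the heads-only save can be illustrated with a minimal sketch in plain Python. The function name `keep_value_head` and the placeholder values are hypothetical, and this is not trlx's actual `save_pretrained` implementation; it only assumes, as in the output above, that value-head parameters are the keys starting with `v_head.`.

```python
def keep_value_head(state_dict, prefix="v_head."):
    """Return a copy of state_dict containing only keys under the given prefix."""
    return {k: v for k, v in state_dict.items() if k.startswith(prefix)}


# Placeholder values stand in for tensors; only the key names matter here.
full_state = {
    "v_head.0.weight": 0,
    "v_head.0.bias": 0,
    "v_head.2.weight": 0,
    "v_head.2.bias": 0,
    "base_model.model.transformer.wte.weight": 0,
    "base_model.model.transformer.h.0.attn.c_attn.lora_A.default.weight": 0,
}

heads_only = keep_value_head(full_state)
print(list(heads_only.keys()))
```

Everything under `base_model.` is dropped, including the LoRA weights, so in this sketch the adapter would have to be saved separately.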

nhanph avatar nhanph commented on September 13, 2024

Thank you @maxreciprocate, I see the point about saving only the model's value head now.

My original question came from an observation when running the ILQL training script: I see a pytorch_model.bin whose size is comparable to that of the original model, so I suspect the base model is also being saved. Is the heads_only flag set to True anywhere during checkpointing when peft_config is used? I cannot find it set anywhere.
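One way to check what such a checkpoint actually holds is to group its parameter names by top-level prefix. The sketch below uses a hypothetical helper, `count_by_prefix`, which is not part of trlx; it works on any iterable of key names, and loading the real file (for example with `torch.load(path, map_location="cpu")`) is assumed to happen separately.

```python
from collections import Counter


def count_by_prefix(keys):
    """Count parameter names by the part before the first dot."""
    return Counter(key.split(".", 1)[0] for key in keys)


# Example key names in the style of the listing above; a real run would pass
# the keys of the state dict loaded from pytorch_model.bin instead.
example_keys = [
    "v_head.0.weight",
    "v_head.0.bias",
    "v_head.2.weight",
    "v_head.2.bias",
    "base_model.model.transformer.wte.weight",
    "base_model.model.transformer.wpe.weight",
]

counts = count_by_prefix(example_keys)
print(dict(counts))
```

If any `base_model` entries show up in the count, the file contains more than the value heads.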
