Comments (5)

efrantar commented on July 19, 2024

Hi, --save saves a HuggingFace checkpoint of the sparse model in which the pruned weights are exactly 0. In principle, you should be able to use this with an appropriate finetuning script. However, if you want to keep the model sparse, you have to make sure that the weights that are exactly 0 remain 0. A simple way to accomplish this is to store the mask at the beginning (e.g. via p == 0 for each parameter in the saved checkpoint) and then zero out the corresponding weights directly after each gradient update (i.e., after each optimizer.step()).
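The mask-and-rezero approach described above can be sketched roughly as follows. This is a minimal illustration, not code from the sparsegpt repository: the helper names capture_sparsity_masks and reapply_sparsity_masks are hypothetical, and a tiny nn.Linear with hand-zeroed columns stands in for the checkpoint written by --save.

```python
import torch
import torch.nn as nn


def capture_sparsity_masks(model: nn.Module) -> dict:
    """Record, for every parameter, which entries are exactly zero (p == 0)."""
    return {name: (p.data == 0) for name, p in model.named_parameters()}


@torch.no_grad()
def reapply_sparsity_masks(model: nn.Module, masks: dict) -> None:
    """Set every entry that was zero in the checkpoint back to zero."""
    for name, p in model.named_parameters():
        p.masked_fill_(masks[name], 0.0)


# Toy stand-in for the pruned checkpoint (assumption: a real run would
# load the saved HuggingFace model instead).
torch.manual_seed(0)
model = nn.Linear(8, 4)
with torch.no_grad():
    model.weight[:, ::2] = 0.0  # pretend these entries were pruned

masks = capture_sparsity_masks(model)  # store the mask once, before training
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(3):
    x = torch.randn(16, 8)
    loss = model(x).pow(2).mean()  # placeholder loss for illustration
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    reapply_sparsity_masks(model, masks)  # directly after each optimizer.step()
```

Re-zeroing after the step, rather than masking gradients, also covers optimizers whose momentum or weight-decay terms could otherwise move a pruned weight away from 0.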

from sparsegpt.

kiucho commented on July 19, 2024

Thanks for your reply!
Let me ask you one more question.

Below is the code that outputs the number of parameters for the dense and pruned models.
[image: code and output showing the parameter counts]

  1. If the number of parameters is exactly the same, is there no advantage in model size?

  2. When I set the sparsity to 0.5, 50% of all parameters are set to zero, but it seems that those parameters are still involved in the multiplication and addition operations during the forward pass. So is there a computational cost benefit to the pruned model?

Thank you.

moonlightian commented on July 19, 2024

Unstructured pruning only sets weights to zero rather than removing them, so it does not change the size of the model.
As for the second question, I am interested too. It seems that the computational benefit is not optimal, but the model can be sped up with CUTLASS.
I may be wrong; looking forward to the author's reply.
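The point about unchanged model size can be checked with a small sketch like the one below. It is not the code from the question (that was posted as an image); the parameter_stats helper is hypothetical, and simple per-row magnitude pruning is used as a stand-in for SparseGPT purely to produce a 50% sparse layer.

```python
import torch
import torch.nn as nn


def parameter_stats(model: nn.Module) -> dict:
    """Count stored parameters, nonzero parameters, and dense storage bytes."""
    total = sum(p.numel() for p in model.parameters())
    nonzero = sum(int(torch.count_nonzero(p)) for p in model.parameters())
    dense_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    return {"total": total, "nonzero": nonzero, "dense_bytes": dense_bytes}


torch.manual_seed(0)
dense = nn.Linear(64, 64, bias=False)
pruned = nn.Linear(64, 64, bias=False)
pruned.load_state_dict(dense.state_dict())

# Stand-in for pruning (assumption: magnitude pruning, NOT SparseGPT):
# zero the smaller-magnitude half of every row.
with torch.no_grad():
    w = pruned.weight
    idx = w.abs().argsort(dim=1)[:, : w.shape[1] // 2]
    w.scatter_(1, idx, 0.0)

dense_stats = parameter_stats(dense)
pruned_stats = parameter_stats(pruned)
print(dense_stats)
print(pruned_stats)
```

Both models report the same total and the same dense byte count; only the nonzero count differs, because the zeros are still stored as ordinary tensor entries.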

SSshuishui commented on July 19, 2024

Maybe an optimized storage format could be used to save the sparse model? Otherwise, saving large models, such as the 175B model, will become a problem.
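One possible shape for such a storage format, sketched under assumptions: keep a 1-bit-per-entry mask plus only the nonzero values. Nothing like this is provided by the sparsegpt repository as far as this thread shows; pack_sparse and unpack_sparse are hypothetical names, and a random fp16 matrix stands in for a pruned weight.

```python
import numpy as np


def pack_sparse(weights: np.ndarray) -> dict:
    """Store a bit mask of nonzero positions plus the nonzero values only."""
    mask = weights != 0
    return {
        "shape": weights.shape,
        "mask_bits": np.packbits(mask.ravel()),  # 1 bit per entry
        "values": weights[mask],                 # nonzeros in row-major order
    }


def unpack_sparse(packed: dict) -> np.ndarray:
    """Rebuild the dense array from the bit mask and the nonzero values."""
    count = int(np.prod(packed["shape"]))
    mask = np.unpackbits(packed["mask_bits"], count=count).astype(bool)
    dense = np.zeros(count, dtype=packed["values"].dtype)
    dense[mask] = packed["values"]
    return dense.reshape(packed["shape"])


# Random stand-in for a roughly 50% sparse fp16 weight matrix.
rng = np.random.default_rng(0)
weights = rng.standard_normal((128, 128)).astype(np.float16)
weights[rng.random(weights.shape) < 0.5] = 0

packed = pack_sparse(weights)
packed_bytes = packed["mask_bits"].nbytes + packed["values"].nbytes
restored = unpack_sparse(packed)
print(weights.nbytes, packed_bytes)
```

At 50% sparsity in fp16 this costs about 9 bits per entry instead of 16. It only helps storage on disk: the array still has to be unpacked to dense form before standard kernels can use it.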

kiucho commented on July 19, 2024

Speedup with CUTLASS is now available in PyTorch 2.1, but the storage issue is unlikely to be resolved until something else comes along.

Closing this issue.
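Assuming the PyTorch 2.1 speedup mentioned above refers to its prototype semi-structured sparsity support (torch.sparse.to_sparse_semi_structured), that path expects a 2:4 pattern, i.e. at most two nonzeros in every group of four consecutive weights, rather than arbitrary 50% sparsity. The check below is a hypothetical helper (is_2_to_4_sparse is not a PyTorch function) that only inspects the pattern and runs on CPU; it does not perform the conversion itself.

```python
import torch


def is_2_to_4_sparse(weight: torch.Tensor) -> bool:
    """True if every group of 4 consecutive values along the last dim has at most 2 nonzeros."""
    if weight.shape[-1] % 4 != 0:
        return False
    groups = weight.reshape(-1, 4)
    return bool(((groups != 0).sum(dim=1) <= 2).all())


# Both rows are 50% sparse, but only the first follows the 2:4 pattern.
structured = torch.tensor([[1.0, 0.0, 0.0, 2.0, 0.0, 3.0, 4.0, 0.0]])
unstructured = torch.tensor([[1.0, 2.0, 3.0, 0.0, 0.0, 0.0, 0.0, 4.0]])

print(is_2_to_4_sparse(structured))
print(is_2_to_4_sparse(unstructured))
```

Under that assumption, a model pruned to unstructured 50% sparsity would generally not qualify, while one pruned with a 2:4 constraint would.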
