Comments (5)
Hi, --save saves a HuggingFace checkpoint of the sparse model in which the pruned weights are exactly 0. In principle, you should be able to use this with an appropriate finetuning script; however, if you want to keep the model sparse, you have to make sure that the exactly-zero weights remain zero. A simple way to accomplish this is to store the mask at the beginning (e.g. via p == 0 for each parameter in the saved checkpoint) and then zero out the corresponding weights directly after each gradient update (i.e., after each optimizer.step()).
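A minimal sketch of that recipe, assuming a standard PyTorch training loop (the model, dataloader, and optimizer names here are illustrative placeholders):

```python
import torch

def finetune_preserving_sparsity(model, dataloader, optimizer):
    # Record which weights are exactly zero in the saved sparse checkpoint.
    masks = {name: p.detach() == 0 for name, p in model.named_parameters()}

    for batch in dataloader:
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        # Re-zero the pruned positions so the sparsity pattern survives the update.
        with torch.no_grad():
            for name, p in model.named_parameters():
                p.masked_fill_(masks[name], 0.0)
```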
from sparsegpt.
Thanks for your reply!
Let me ask you one more question.
Below is the code that outputs the number of parameters for the dense and pruned models.
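(Something along these lines; a minimal sketch assuming both checkpoints load with Hugging Face transformers, with the checkpoint names/paths only illustrative.)

```python
import torch
from transformers import AutoModelForCausalLM

def count_params(model):
    total = sum(p.numel() for p in model.parameters())
    nonzero = sum(int(torch.count_nonzero(p)) for p in model.parameters())
    return total, nonzero

dense = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
pruned = AutoModelForCausalLM.from_pretrained("path/to/pruned-checkpoint")  # hypothetical path

print("dense :", count_params(dense))   # total == nonzero
print("pruned:", count_params(pruned))  # same total, ~50% nonzero at sparsity 0.5
```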
- If the number of parameters is exactly the same, is there no advantage in model size?
- When I set the sparsity to 0.5, 50% of all parameters are set to zero, but it seems those parameters are still involved in the multiply and add operations during the forward pass. So is there a computational cost benefit to the pruned model?
Thank you.
from sparsegpt.
Sparsity here is an unstructured pruning method, so it would not change the size of the model.
As for the second question, I am interested too. It seems the computational cost benefit is not optimal, but the model can be sped up with CUTLASS.
This may not be correct; looking forward to the author's reply.
from sparsegpt.
Maybe some optimized storage method could be used to save the sparse model? Otherwise, saving large models such as the 175B one will become a problem.
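For what it's worth, a minimal sketch of one such option using PyTorch's CSR sparse format (the matrix size is illustrative; this only affects on-disk/in-memory storage, not what sparsegpt itself saves):

```python
import torch

# A 50%-sparse weight stored densely vs. in CSR form.
w = torch.randn(4096, 4096)
w[torch.rand_like(w) < 0.5] = 0.0

w_csr = w.to_sparse_csr()

dense_bytes = w.numel() * w.element_size()
csr_bytes = sum(t.numel() * t.element_size()
                for t in (w_csr.values(), w_csr.col_indices(), w_csr.crow_indices()))
print(dense_bytes, csr_bytes)
# With fp32 values and int64 column indices, CSR is actually *larger* at 50% sparsity;
# a more compact index encoding (or higher sparsity) is needed before this pays off.
```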
from sparsegpt.
Speedup with CUTLASS is now available with PyTorch 2.1, but the storage issue is unlikely to be resolved until something else comes along.
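A rough sketch of that path, assuming PyTorch >= 2.1, an Ampere-or-newer GPU, and a weight that already follows the 2:4 pattern (the layer shape here is illustrative):

```python
import torch
from torch.sparse import to_sparse_semi_structured

# A linear layer whose fp16 weight satisfies 2:4 sparsity (2 zeros in every block of 4).
linear = torch.nn.Linear(4096, 4096, bias=False).half().cuda()
mask = torch.Tensor([0, 0, 1, 1]).tile((4096, 1024)).half().cuda()
with torch.no_grad():
    linear.weight.mul_(mask)

# Swap the dense weight for a semi-structured sparse tensor (CUTLASS-backed kernels).
linear.weight = torch.nn.Parameter(to_sparse_semi_structured(linear.weight))

x = torch.rand(64, 4096, dtype=torch.float16, device="cuda")
with torch.inference_mode():
    y = linear(x)
```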
Closing this issue.
from sparsegpt.
Related Issues (20)
- OOM:cannot download opt-30b, opt-66b
- How should I verify the speedup effect of the algorithm? HOT 4
- Purpose of this update
- Inference Speedup HOT 3
- Dependencies are wrong HOT 3
- Would sparsegpt be available for Llama2? HOT 3
- When would the code for GPT-J-6B be released?
- Adaptation for Pruning Conv2d or Conv3d Layers? HOT 1
- Can SparseGPT be used on BERT ?
- Using llama.py silently fails and occasionally causes system instability
- transformers version is not correct
- Mistral Support HOT 2
- how to use for Baichuan?
- 2:4 sparsity with to_sparse_semi_structured method from pytorch results in memory issue
- Why Hessian can get by activation ($H = XX^T$) ? HOT 4
- Why transpose the input when in case of nn.Linear or nn.Conv1d?
- AttributeError: 'NoneType' object has no attribute 'shape' HOT 7
- AWQ alongside sparsegpt
- why i can't reproduce the result of paper? HOT 7