Comments (19)
Which versions of Transformers.jl and Pickle.jl are you using?
from transformers.jl.
(@v1.9) pkg> status Transformers
Status ~/.julia/environments/v1.9/Project.toml
Transformers v0.2.8
Pickle v0.3.2
Ok, so you would need to update Pickle.jl to 0.3.3 which adds support for bfloat16.
pkg> update Pickle
wouldn't do it.
I'm on Linux/Ubuntu. Is this platform-specific?
Not really. What happens if you explicitly add Pickle@0.3.3? I'm guessing there are some compat issues that block the update.
(@v1.9) pkg> add Pickle@0.3.3
Resolving package versions...
ERROR: Unsatisfiable requirements detected for package ReinforcementLearningZoo [d607f57d]:
I will create a local project, activate it, and see if that works.
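For reference, a minimal local-project setup along these lines should sidestep the compat conflict, since a fresh environment resolves versions independently of the default one (the directory name here is just an example):

```julia
# Create and activate a fresh project so Transformers.jl and Pickle.jl
# can resolve compatible versions, unconstrained by other packages
# (e.g. ReinforcementLearningZoo) in the shared default environment.
using Pkg
Pkg.activate("dolly-test")                      # hypothetical project directory
Pkg.add("Transformers")
Pkg.add(name = "Pickle", version = "0.3.3")     # pin the bfloat16-capable release
Pkg.status()                                    # verify Pickle v0.3.3 resolved
```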
Similar warning as before:
textenc = hgf"databricks/dolly-v2-12b:tokenizer"
┌ Warning: fuse_unk is unsupported, the tokenization result might be slightly different in some cases.
└ @ Transformers.HuggingFace ~/.julia/packages/Transformers/lD5nW/src/huggingface/tokenizer/utils.jl:42
I will load the model and see if it works.
This time there is no error, but the model hangs:
model = todevice(hgf"databricks/dolly-v2-12b":ForCausalLM")
The warning can usually be ignored.
model = todevice(hgf"databricks/dolly-v2-12b":ForCausalLM")
That is a big model, which takes time to move to the GPU. Also, there is an extra " in that line.
I have a 3080 Ti. I'll wait and let you know. Thank you. I was able to run other 13B models fairly quickly with Ollama.
Finally it put out an error:
ERROR: LoadError: syntax: cannot juxtapose string literal
That is because of the extra ".
Let me see where it is happening... it is this line:
model = todevice(hgf"databricks/dolly-v2-12b:ForCausalLM")
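Before retrying the 12B download, the corrected quoting can be sanity-checked with a much smaller checkpoint, so any syntax or download problem surfaces in seconds. This is only a sketch: gpt2 is used as an example model, and whether it accepts the same :ForCausalLM task name is an assumption based on the pattern used above.

```julia
using Transformers, Transformers.HuggingFace

# Same hgf"owner/name:kind" pattern as with dolly-v2-12b, but with a
# tiny model, so quoting or typo problems show up quickly.
textenc = hgf"gpt2:tokenizer"
model   = hgf"gpt2:ForCausalLM"   # note: exactly one closing " at the end
```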
The other thing I noticed: nvidia-smi was constant, almost as if nothing was being copied to the GPU.
I will paste the line into a vim editor and retype it; double quotes sometimes pick up special characters, and that would eliminate one possible cause.
Still no luck. It just kills my shell after a while. I will come back and try again later.
For now I will stick with OpenAI.jl and continue my work.
Thank you for trying to help.
It sounds like the process might be killed due to OOM.
Currently you would need about 70 GB of CPU memory to load the 12B model. This is actually larger than the size of the model weights, due to an implementation detail: the weights are copied directly from disk into memory, and during construction of the model object another copy is made on the CPU. So in the end it takes at least 2x (or more, depending on the data type) the size of the model weights.
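Given that behavior, a quick pre-flight check against free system memory can tell you whether the load is likely to OOM before you attempt it. The weight size below is a rough assumption for a 12B-parameter model, and the 2x factor simply follows the two-copies explanation above:

```julia
# Rough pre-flight check: compare free system memory against an
# estimated working set of ~2x the on-disk weight size.
weights_gb = 24.0                    # assumed bf16 size of a 12B model
needed_gb  = 2 * weights_gb          # two CPU-side copies during construction
free_gb    = Sys.free_memory() / 2^30

if free_gb < needed_gb
    @warn "Likely to OOM while loading the model" free_gb needed_gb
end
```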