Comments (4)
Hi, thanks for pointing this out! This is a typo, and it should be num_tokens. I think it was an artifact left over from early in development. I believe 8216fea should fix this, but please let me know if not.
from gill.
That's what I was thinking too :)
Also, loading a newly trained model to train the decision classifier in "TrainDecisionClassifier" fails too. It is due to an error raised on line 45 of models.py, saying:
AttributeError: 'args' object has no attribute 'text_emb_layers'
After a training run, three arguments are missing from the saved model_args.json file (compared to the JSON file provided in the repo):
"text_emb_layers": [
-1
],
"share_ret_gen": true,
"norm_image_embed": "none"
I think this is the reason.
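A quick way to confirm which keys are missing is to diff the two JSON files. This is only a minimal sketch, assuming both files hold a flat JSON object; the function names and the file paths passed to them are hypothetical and not part of the gill repo:

```python
import json


def missing_keys(reference, trained):
    """Return the keys present in `reference` but absent from `trained`, sorted."""
    return sorted(set(reference) - set(trained))


def missing_keys_in_files(reference_path, trained_path):
    """Load two JSON files (hypothetical paths) and compare their top-level keys."""
    with open(reference_path) as f:
        reference = json.load(f)
    with open(trained_path) as f:
        trained = json.load(f)
    return missing_keys(reference, trained)
```

Calling `missing_keys_in_files` with the repo's model_args.json and the one written by a new training run would list whichever keys were not saved.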
P.S.: I am just trying to train the model with a different LLM and rerun all the scripts.
from gill.
Thanks! This is helpful. The other arguments are not used, but text_emb_layers is, so we need it to be saved in the model_args.json file. Oddly enough, it doesn't seem to be saved here, even though it's part of model_args. I guess we can explicitly add it, since there's no command line flag to change text_emb_layers in main.py: 6b183ac
Sorry about that, text_emb_layers was something used for debugging and I didn't remove it completely in the final version. I think with your newly trained LLM, you can also edit the .json to set "text_emb_layers": [-1] and it should work fine. Hope that helps!
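If editing the file by hand is inconvenient, the same change can be scripted. A rough sketch, assuming model_args.json is a flat JSON object; the helper names are made up for illustration and are not part of the gill codebase, and the default value [-1] is taken from the JSON file quoted above:

```python
import json

# Assumed default, copied from the model_args.json shipped with the repo.
DEFAULTS = {"text_emb_layers": [-1]}


def add_missing_defaults(model_args, defaults):
    """Return a copy of model_args with absent keys filled from defaults.

    Keys that already exist are left untouched.
    """
    patched = dict(model_args)
    for key, value in defaults.items():
        patched.setdefault(key, value)
    return patched


def patch_model_args_file(path, defaults=DEFAULTS):
    """Rewrite the JSON file at `path` (hypothetical path) with defaults added."""
    with open(path) as f:
        model_args = json.load(f)
    with open(path, "w") as f:
        json.dump(add_missing_defaults(model_args, defaults), f, indent=2)
```

Using `setdefault` means a value that was saved correctly by a later training run is never overwritten.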
from gill.
Of course :) Adding "text_emb_layers": [-1] explicitly actually worked, thanks!
I see that "share_ret_gen" is used here, but I guess it's not affecting anything.
Thanks anyway.
from gill.