Comments (17)
Thanks for the fix!
I tested udop-dual-large-224 with the code below (I had downloaded the model weights beforehand), and it now works.
from core.models import UdopDualForConditionalGeneration, UdopUnimodelForConditionalGeneration, UdopConfig, \
    UdopTokenizer
config = UdopConfig.from_pretrained("../udop-dual-large-224")
tokenizer = UdopTokenizer.from_pretrained("../udop-dual-large-224")
model = UdopDualForConditionalGeneration.from_pretrained("../udop-dual-large-224")
from i-code.
Yeah, the configurations should be identical to T5's (tokenizer), or they can be initialized with default parameters.
We will add our own configuration files when we add the model to Hugging Face Transformers.
from i-code.
Thanks again for your great interest in our work. We also appreciate the Hugging Face team's efforts. We'll fix the issue on our side ASAP.
from i-code.
@go2carter
You might have to add a few keys to the config manually. Check out the values from this config
This is from the PR to add the model to huggingface. Unless you are in a rush, it might be best to wait for the model to be added to huggingface. Everything will work much smoother then :)
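A minimal sketch of adding the missing keys by hand, assuming the config behaves like a plain attribute container. Here `SimpleNamespace` stands in for the real config object, `add_missing_keys` is a hypothetical helper, and the values are placeholders only; the actual values should come from the config referenced above.

```python
from types import SimpleNamespace

# Key names come from the errors and the fix discussed in this thread.
# The values are PLACEHOLDERS (assumptions), not the real UDOP settings.
PLACEHOLDER_DEFAULTS = {
    "truncate_encoder_after_layer": None,
    "truncate_decoder_after_layer": None,
    "max_2d_position_embeddings": 1024,
}


def add_missing_keys(config, defaults):
    """Set each default on config only if the attribute is absent.

    Returns the sorted names of the attributes that were added.
    """
    added = []
    for name, value in defaults.items():
        if not hasattr(config, name):
            setattr(config, name, value)
            added.append(name)
    return sorted(added)


# Stand-in for a T5-style config; existing values are never overwritten.
config = SimpleNamespace(num_layers=24, max_2d_position_embeddings=512)
added = add_missing_keys(config, PLACEHOLDER_DEFAULTS)
print(added)
```

Only absent attributes are filled in, so any value the config already carries takes precedence over the placeholder.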
from i-code.
@logan-markewich
Thank you! I'll try that out when I get the opportunity. Not in a rush, but I'm excited to try out this work :)
from i-code.
I have made a PR fixing the issues and updated the model checkpoints to Hugging Face-style checkpoints/configs. Let me know if the issue is still not addressed.
from i-code.
@raghavanone According to the README, the vision decoder weights will only be released as part of an Azure API, so they are not exactly openly available.
from i-code.
Yes, running inference with the provided model weights is doable.
from i-code.
@zinengtang I am trying to add the model to Hugging Face here, but the model weights are incomplete without the vision decoder weights.
from i-code.
It's still unclear to me how to use the model.
I tried initializing as
import torch
from core.models import UdopDualForConditionalGeneration, UdopUnimodelForConditionalGeneration, UdopConfig, \
    UdopTokenizer
MODEL_CLASSES = {
    'UdopDual': (UdopConfig, UdopDualForConditionalGeneration, UdopTokenizer),
    'UdopUnimodel': (UdopConfig, UdopUnimodelForConditionalGeneration, UdopTokenizer),
}
config = UdopConfig.from_pretrained("t5-large")
# also tried config = AutoConfig.from_pretrained("t5-large")
tokenizer = UdopTokenizer.from_pretrained("t5-large")
model = UdopUnimodelForConditionalGeneration.from_pretrained("t5-large")
# also tried model = UdopUnimodelForConditionalGeneration(config)
model.load_state_dict(
    torch.load("models/UdopUnimodel-Large-224/pytorch_model.bin"))
which gives
AttributeError: 'T5Config' object has no attribute 'truncate_encoder_after_layer'
from i-code.
I fixed that by changing the lines in "udop_dual.py" and "udop_unimodel.py" to:
if self.is_decoder:
    self.num_layers = (
        config.truncate_decoder_after_layer
        if (hasattr(config, 'truncate_decoder_after_layer') and config.truncate_decoder_after_layer)
        else config.num_layers
    )
else:
    self.num_layers = (
        config.truncate_encoder_after_layer
        if (hasattr(config, 'truncate_encoder_after_layer') and config.truncate_encoder_after_layer)
        else config.num_layers
    )
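The same fallback can be written more compactly with `getattr`. This is a sketch rather than the repository's code: `resolve_num_layers` is a hypothetical helper, and `SimpleNamespace` stands in for the config object.

```python
from types import SimpleNamespace


def resolve_num_layers(config, is_decoder):
    """Return the truncation depth if set and truthy, else config.num_layers."""
    key = (
        "truncate_decoder_after_layer"
        if is_decoder
        else "truncate_encoder_after_layer"
    )
    # getattr with a default covers configs that lack the attribute entirely;
    # `or` covers attributes that exist but are None or 0.
    return getattr(config, key, None) or config.num_layers


# Stand-in configs (assumed shapes, for illustration only).
plain = SimpleNamespace(num_layers=24)
truncated = SimpleNamespace(num_layers=24, truncate_encoder_after_layer=12)

print(resolve_num_layers(plain, is_decoder=False))
print(resolve_num_layers(truncated, is_decoder=False))
print(resolve_num_layers(truncated, is_decoder=True))
```

A config without the truncation attributes falls back to `num_layers`, which matches the behavior of the `hasattr` version above.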
from i-code.
but then I run into the following error for both models:
AttributeError: 'UdopConfig' object has no attribute 'data_dir'
from i-code.
Right, the current checkpoint doesn't contain the vision decoder part. We are still discussing the most appropriate way to open-source that part.
from i-code.
It's still unclear to me how to use the model. I tried initializing as
import torch
from core.models import UdopDualForConditionalGeneration, UdopUnimodelForConditionalGeneration, UdopConfig, \
    UdopTokenizer
MODEL_CLASSES = {
    'UdopDual': (UdopConfig, UdopDualForConditionalGeneration, UdopTokenizer),
    'UdopUnimodel': (UdopConfig, UdopUnimodelForConditionalGeneration, UdopTokenizer),
}
config = UdopConfig.from_pretrained("t5-large")
# also tried config = AutoConfig.from_pretrained("t5-large")
tokenizer = UdopTokenizer.from_pretrained("t5-large")
model = UdopUnimodelForConditionalGeneration.from_pretrained("t5-large")
# also tried model = UdopUnimodelForConditionalGeneration(config)
model.load_state_dict(
    torch.load("models/UdopUnimodel-Large-224/pytorch_model.bin"))
which gives
AttributeError: 'T5Config' object has no attribute 'truncate_encoder_after_layer'
@zinengtang Can you take a look at this?
from i-code.
Right, the current checkpoint doesn't contain the vision decoder part. We are still discussing the most appropriate way to open-source that part.
I see. We should still be able to run inference with the provided model weights, correct?
from i-code.
I've also been having trouble loading the model to test it out. When I tried @maxjeblick's code above with my fixes, I got:
AttributeError: 'T5Config' object has no attribute 'max_2d_position_embeddings'
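Since each run surfaces only one missing attribute, it may help to check for all of them up front. A sketch, assuming the attribute names mentioned in this thread are the relevant ones (the list may well be incomplete); `missing_attributes` is a hypothetical helper and `SimpleNamespace` stands in for the loaded config.

```python
from types import SimpleNamespace

# Attribute names collected from the errors and the fix in this thread.
# This list is an assumption and is possibly incomplete.
REPORTED_ATTRIBUTES = (
    "truncate_encoder_after_layer",
    "truncate_decoder_after_layer",
    "max_2d_position_embeddings",
    "data_dir",
)


def missing_attributes(config, names=REPORTED_ATTRIBUTES):
    """Return the names from `names` that the config does not define."""
    return [name for name in names if not hasattr(config, name)]


# Stand-in config that defines only one of the reported attributes.
config = SimpleNamespace(num_layers=24, data_dir="data")
print(missing_attributes(config))
```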
from i-code.
awesome! Excited to try it out when I get a chance
from i-code.