allenai / x-lxmert
PyTorch code for EMNLP 2020 paper "X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers"
Home Page: https://prior.allenai.org/projects/x-lxmert
Hi!
In the README there are references to scripts for finetuning and testing the different downstream tasks. They should be located in the x-lxmert/scripts/ folder, but don't seem to be there. Is it possible to add them? It would also be nice if the fine-tuned models could be shared, so that the results can be reproduced.
Thanks :)
This is a cool tool, and I really enjoy the images I've gotten from the Demo.
I was hoping one of two things was possible, and I'm wondering whether I'm just missing something basic. First, is there a web API? It'd be amazing to be able to do something similar to:
result = requests.get('https://vision-explorer.allenai.org/text_to_image_generation_api', data={'caption': 'Diamond rose horse'})
save_image(result.json()['image_str'])
Or, if there's no web API (and this looks like something I might actually be able to do):
pip install -r requirements.txt
wget -O image_generator/snap/pretrained/G_60.pth https://ai2-vision-x-lxmert.s3-us-west-2.amazonaws.com/image_generator/G_60.pth
./image_generator/scripts/make_image.py --caption "Diamond rose horse" --outpath my_weird_pic.png
I guess my question is: if all I care about is having a caption and getting an image programmatically, what's the easiest way of doing that?
Can the Mask R-CNN extract features for fixed bounding boxes? I already have the box coordinates and want to extract features for them.
Hi, thanks for this wonderful work! When I try to run the image generation training, it appears that trainer.py is missing from the src. Could you add this missing file to the repo? Thanks a lot for your kind help!
Thanks for sharing the code! Could you check the download links for grid features?
$ wget https://ai2-vision-x-lxmert.s3-us-west-2.amazonaws.com/butd_features/NLVR2/maskrcnn_train_grid8.h5
--2021-01-08 14:57:08-- https://ai2-vision-x-lxmert.s3-us-west-2.amazonaws.com/butd_features/NLVR2/maskrcnn_train_grid8.h5
Resolving ai2-vision-x-lxmert.s3-us-west-2.amazonaws.com (ai2-vision-x-lxmert.s3-us-west-2.amazonaws.com)... 52.218.153.17
Connecting to ai2-vision-x-lxmert.s3-us-west-2.amazonaws.com (ai2-vision-x-lxmert.s3-us-west-2.amazonaws.com)|52.218.153.17|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2021-01-08 14:57:08 ERROR 403: Forbidden.
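To see which of the feature links are affected without downloading the (large) files, each URL can be probed with a HEAD request. The following is a minimal sketch using only the Python standard library; the helper names describe, probe and report are illustrative and not part of the x-lxmert repo, and the only URL assumed is the one from the log above.

```python
import urllib.error
import urllib.request


def describe(status):
    """Map an HTTP status code to a short verdict (hypothetical helper)."""
    if 200 <= status < 300:
        return "ok"
    if status == 403:
        return "forbidden"
    if status == 404:
        return "not found"
    return f"unexpected status {status}"


def probe(url, timeout=10):
    """Send a HEAD request and return the status code without fetching the body."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is carried on the exception.
        return err.code


def report(urls):
    """Return a list of (url, verdict) pairs for the given download links."""
    return [(url, describe(probe(url))) for url in urls]
```

Calling report(["https://ai2-vision-x-lxmert.s3-us-west-2.amazonaws.com/butd_features/NLVR2/maskrcnn_train_grid8.h5"]) would list the verdict for that link; further feature URLs from the README can be added to the list to check them all in one pass.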
It seems like this might not be the original code used for the paper: I see quite a few bugs, ranging from typos and syntax errors to a file structure that differs from the instructions (specifically for image generation). It would be nice if the authors (@j-min) could verify whether this is the actual code and whether it works for them.