yuleiniu / rva
Code for CVPR'19 "Recursive Visual Attention in Visual Dialog"
License: Other
Under Extracting Features (Optional), you have this step:
Prepare the MSCOCO and Flickr images.
Is any preprocessing step required after that?
Thanks for your help :)
Hi, thank you for sharing your code.
In your paper, you wrote that the RvA model obtains a one-hot vector using Eq. 19. Strictly speaking, I think that is not correct.
The RvA model obtains the one-hot vector as one_hot(argmax(Eq. 20)), not from Eq. 19.
In other words, the softmax operation is always applied, regardless of the mode (soft / hard).
For more details, please refer to the F.gumbel_softmax code below:
https://pytorch.org/docs/stable/_modules/torch/nn/functional.html
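To make the point above concrete, here is a minimal, dependency-free sketch of the behaviour being described. It is not the PyTorch source and not the repository's code: the helper names (`softmax`, `sample_gumbel`, `gumbel_softmax`) and the example probabilities are assumptions chosen for illustration. It shows that the soft sample is always computed first, and that the hard output is just one_hot(argmax(...)) of that soft sample.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a plain Python list.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sample_gumbel(rng):
    # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1).
    u = max(rng.random(), 1e-12)  # guard against log(0)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, tau=1.0, hard=False, rng=random):
    # Soft sample (the form the issue attributes to Eq. 20): always computed.
    noisy = [(l + sample_gumbel(rng)) / tau for l in logits]
    y_soft = softmax(noisy)
    if not hard:
        return y_soft
    # Hard sample: one_hot(argmax(y_soft)), derived from the soft sample.
    k = max(range(len(y_soft)), key=y_soft.__getitem__)
    return [1.0 if i == k else 0.0 for i in range(len(y_soft))]


# Illustrative probabilities (an assumption); logits are their logs.
logits = [math.log(p) for p in (0.7, 0.2, 0.1)]
soft = gumbel_softmax(logits, tau=0.5, rng=random.Random(0))
hard = gumbel_softmax(logits, tau=0.5, hard=True, rng=random.Random(0))
```

With the same seed, `hard` has its single 1.0 at the position where `soft` is largest. In PyTorch's F.gumbel_softmax, the hard branch additionally returns `y_hard - y_soft.detach() + y_soft`, so gradients still flow through the soft sample; this sketch has no autograd and omits that step.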
Hi Yulei,
Thanks for sharing this code. I am running it as a baseline. In the GloVe embedding initialization step,
python data/init_glove.py
I think the file 'init_glove.py' is missing.
Hi,
I think the distribution, i.e. pi in Eq. (19) and Eq. (20), should have a log applied before it is added to g. However, I have checked your code, and you do not take the log of pi. Does it matter?
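As a rough check of why the log matters, the following sketch compares argmax(log(pi) + g) with argmax(pi + g) over many draws of Gumbel noise g. It is an illustration under stated assumptions, not the repository's code: the function names, the example distribution, and the sample count are all hypothetical. By the Gumbel-max trick, the first form samples index i with probability pi_i, while the second samples from softmax(pi), which is a flatter distribution.

```python
import math
import random


def gumbel(rng):
    # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1).
    u = max(rng.random(), 1e-12)  # guard against log(0)
    return -math.log(-math.log(u))


def argmax_frequencies(scores, n_samples, rng):
    # Empirical distribution of argmax_i(scores[i] + g_i).
    counts = [0] * len(scores)
    for _ in range(n_samples):
        noisy = [s + gumbel(rng) for s in scores]
        counts[noisy.index(max(noisy))] += 1
    return [c / n_samples for c in counts]


pi = [0.7, 0.2, 0.1]  # illustrative distribution (an assumption)
rng = random.Random(0)

# With the log: frequencies should approach pi itself.
freq_log = argmax_frequencies([math.log(p) for p in pi], 20000, rng)

# Without the log: frequencies should approach softmax(pi),
# roughly (0.46, 0.28, 0.26) for this pi.
freq_raw = argmax_frequencies(pi, 20000, rng)
```

So omitting the log does not break sampling, but it changes which distribution is sampled: the probabilities are treated as logits, and because they all lie in [0, 1] the resulting distribution is pulled toward uniform.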
In rva/visdialch/data/readers.py, line 191, there is no self.loc_feats.
Hi,
Did you also face this parallelization error while training? It seems strange, since I am in fact using only one GPU ID.
File " /home/shubham/rva/visdialch/encoders/modules.py", line 233, in forward
ques_prob_refine = torch.bmm(ques_gs[:, i, :].view(-1, 1, 2), ques_prob).view(-1, 1, 2) # shape: [batch_size, num_rounds, 2]
RuntimeError: arguments are located on different GPUs at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:479
Any suggestions, please?
Hello, I want to ask about the image features extracted with Detectron. When I extract them, their shape is (36, 1024), but the features extracted by batra-mlp-lab have shape (36, 2048). How should I deal with this mismatch? Thanks.