layer6ai-labs / SGG-Seq2Seq

Code for the ICCV'21 paper "Context-aware Scene Graph Generation with Seq2Seq Transformers"
I can't find preprocess_evaluation.py and write_prediction.py. Could you provide them, please?
Thank you for your code. When I run this repo, a few things confuse me.
In https://github.com/layer6ai-labs/SGG-Seq2Seq/blob/main/transformer.py#L238, relation_ids contains relation labels such as tensor([[0, 48, 31, 48, 31, 31, 22, 31, 48, 48, 22]], device='cuda:0'). These relation_ids are embedded with nn.Embedding and fed into the TripletEncoder. But I think relation_ids are the ground-truth labels; maybe I'm wrong.
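For context, feeding ground-truth relation labels through an nn.Embedding during training is the standard teacher-forcing setup for seq2seq decoders; at inference time the model's own predictions would be fed back instead. A minimal sketch of the embedding step (NUM_RELATIONS and EMB_DIM are illustrative values, not the repo's actual configuration):

```python
import torch
import torch.nn as nn

NUM_RELATIONS = 51   # illustrative: e.g. 50 predicates + a background class
EMB_DIM = 64         # illustrative embedding width

relation_emb = nn.Embedding(NUM_RELATIONS, EMB_DIM)

# ground-truth relation labels for one image, as quoted in the issue
relation_ids = torch.tensor([[0, 48, 31, 48, 31, 31, 22, 31, 48, 48, 22]])

# each integer label becomes a learned EMB_DIM-dimensional vector
embedded = relation_emb(relation_ids)
print(embedded.shape)  # torch.Size([1, 11, 64])
```

Whether this constitutes label leakage depends on whether the same ground-truth ids are also fed at evaluation time, which is what the issue is asking about.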
I ran the repo with python trainer.py --dataset vg --batch_size 512 --nhead 4 and got:
However, I find that this repo doesn't use the visual features of the object regions?
The union features in this repo, called pairwise_features, only contain box features (https://github.com/layer6ai-labs/SGG-Seq2Seq/blob/main/feature_utils.py#L65). Have you considered using visual union features like Motifs' UnionBoxesAndFeats? Thanks.
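For reference, a Motifs-style union feature is pooled from the smallest box enclosing both the subject and object boxes, whereas pure box features only encode coordinates. A minimal sketch of the geometric part (boxes in (x1, y1, x2, y2) format; the function name is illustrative, and the real UnionBoxesAndFeats additionally ROI-pools CNN features from this region):

```python
def union_box(box_a, box_b):
    """Smallest axis-aligned box enclosing both input boxes.

    Boxes are (x1, y1, x2, y2) tuples. This is only the geometry;
    a visual union feature would pool backbone features over it.
    """
    return (min(box_a[0], box_b[0]), min(box_a[1], box_b[1]),
            max(box_a[2], box_b[2]), max(box_a[3], box_b[3]))

print(union_box((10, 20, 50, 60), (40, 10, 90, 55)))  # (10, 10, 90, 60)
```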
Where is the code for the RL strategy?
There seem to be some bugs in this repo. For instance, the d_model parameter of nn.TransformerEncoderLayer (line 30 of transformer.py) is the number of expected features in the input, and it must be divisible by num_heads. But in this repo d_model is set to NUM_BOX_FEATURES = 109 while num_heads is 4 (109 % 4 == 1).
Also, tgt_key_padding_mask cannot match tgt_mask.
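To illustrate the divisibility constraint: PyTorch's multi-head attention splits d_model evenly across the heads, so d_model % nhead must be 0, and d_model=109 with nhead=4 fails at construction. A minimal reproduction plus one possible workaround (projecting the 109-d box features to a head-divisible width first; the choice of 112 is just an example, not the repo's fix):

```python
import torch
import torch.nn as nn

NUM_BOX_FEATURES = 109  # as in the repo

# Raises AssertionError: embed_dim must be divisible by num_heads
try:
    nn.TransformerEncoderLayer(d_model=NUM_BOX_FEATURES, nhead=4)
except AssertionError as e:
    print(e)

# Workaround sketch: project the features to a divisible width first
proj = nn.Linear(NUM_BOX_FEATURES, 112)          # 112 % 4 == 0
layer = nn.TransformerEncoderLayer(d_model=112, nhead=4)

x = torch.randn(10, 2, NUM_BOX_FEATURES)         # (seq, batch, features)
out = layer(proj(x))
print(out.shape)                                  # torch.Size([10, 2, 112])
```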
Epoch 1 Recall@5 0.40540524032301645 Recall@10 0.5550042311638902 Recall@20 0.691875286804221 Recall@50 0.7947694177616892 Recall@100 0.82678933109488...
Traceback (most recent call last):
File "D:/xuexi/graduate/project/SGG-Seq2Seq-main/trainer.py", line 406, in
main()
File "D:/xuexi/graduate/project/SGG-Seq2Seq-main/trainer.py", line 322, in main
predictions = model(
File "D:\software\anaconda\envs\pytorch1.8\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\xuexi\graduate\project\SGG-Seq2Seq-main\transformer.py", line 92, in forward
decoder_states = self.decoder(
File "D:\software\anaconda\envs\pytorch1.8\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\software\anaconda\envs\pytorch1.8\lib\site-packages\torch\nn\modules\transformer.py", line 291, in forward
output = mod(output, memory, tgt_mask=tgt_mask,
File "D:\software\anaconda\envs\pytorch1.8\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\software\anaconda\envs\pytorch1.8\lib\site-packages\torch\nn\modules\transformer.py", line 576, in forward
x = self.norm1(x + self._sa_block(x, tgt_mask, tgt_key_padding_mask))
File "D:\software\anaconda\envs\pytorch1.8\lib\site-packages\torch\nn\modules\transformer.py", line 585, in _sa_block
x = self.self_attn(x, x, x,
File "D:\software\anaconda\envs\pytorch1.8\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\software\anaconda\envs\pytorch1.8\lib\site-packages\torch\nn\modules\activation.py", line 1153, in forward
attn_output, attn_output_weights = F.multi_head_attention_forward(
File "D:\software\anaconda\envs\pytorch1.8\lib\site-packages\torch\nn\functional.py", line 5155, in multi_head_attention_forward
assert key_padding_mask.shape == (bsz, src_len),
AssertionError: expecting key_padding_mask shape of (2, 100), but got torch.Size([2, 78])
Process finished with exit code 1
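The assertion in the traceback comes from the decoder self-attention: tgt_key_padding_mask must have shape (batch, tgt_len), while a mask over the encoder memory belongs in memory_key_padding_mask with shape (batch, src_len). Here a (2, 78) mask was applied to a 100-step target sequence. A minimal sketch of the constraint, with illustrative dimensions (d_model=16 is not the repo's value):

```python
import torch
import torch.nn as nn

layer = nn.TransformerDecoderLayer(d_model=16, nhead=4)

tgt = torch.randn(100, 2, 16)      # (tgt_len=100, batch=2, d_model)
memory = torch.randn(78, 2, 16)    # (src_len=78, batch=2, d_model)

# Wrong: a (batch, 78) padding mask applied to the 100-step target
bad_mask = torch.zeros(2, 78, dtype=torch.bool)
try:
    layer(tgt, memory, tgt_key_padding_mask=bad_mask)
except AssertionError as e:
    print(e)  # expecting key_padding_mask shape of (2, 100) ...

# Right: one mask per sequence, each matching that sequence's length
tgt_pad = torch.zeros(2, 100, dtype=torch.bool)   # (batch, tgt_len)
mem_pad = torch.zeros(2, 78, dtype=torch.bool)    # (batch, src_len)
out = layer(tgt, memory,
            tgt_key_padding_mask=tgt_pad,
            memory_key_padding_mask=mem_pad)
print(out.shape)   # torch.Size([100, 2, 16])
```

So the fix is likely to build tgt_key_padding_mask from the padded target length rather than from the memory length.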
Hi, the sg_dataset.zip URL can't be opened. Is there any other way to download it?
Thanks!
Are you going to provide pretrained models for your method (Transformer)?
Hi, when I run preprocess_evaluation.py, 'data/detection.txt' can't be found. How can I get this file? Thanks!