xuewyang / joganic

Python 99.52% Shell 0.48%

joganic's Introduction

JoGANIC

A repo for JoGANIC - Journalistic Guidelines Aware News Image Captioning, accepted at EMNLP 2021. This is an unofficial repo. Please refer to the official repo: https://github.com/dataminr-ai/joganic

"Journalistic Guidelines Aware News Image Captioning" by Xuewen Yang, Svebor Karaman, Joel Tetreault, and Alex Jaimes.

@inproceedings{yang-etal-2021-journalistic,
    title = "Journalistic Guidelines Aware News Image Captioning",
    author = "Yang, Xuewen  and
      Karaman, Svebor  and
      Tetreault, Joel  and
      Jaimes, Alejandro",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    year = "2021",
    pages = "5162--5175",
}

joganic's Issues

the reproduce metrics

Hi, I've figured out the apex problem! I can now train the JoGANIC model on all 6.6 million samples in NYTimes. I followed the config of the Tell model, and I checked that my data and labels are correct. In the end I got 5.57 (BLEU), 20.00 (Rouge), and 46.33 (CIDEr). This is quite different from the paper, which reports 6.39 (BLEU), 22.38 (Rouge), and 56.54 (CIDEr).
My main config is:

lr 0.0001
t_total 437600
num_epoch 100
eval_limit 5120

Sorry to bother you all the time. If you know how to reproduce the metrics more closely, please give me some advice. It is really important to my research right now, and I would appreciate it a lot if I could include this model in my ablation experiments.
May I have your email so I can learn more about the model?
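
To put the gap described above in relative terms, here is a small sketch that computes how far each reproduced score falls below the score quoted from the paper. The numbers are copied from this issue; the names `scores`, `relative_gap`, and `gaps` are only illustrative and do not come from the repo.

```python
# Scores quoted in this issue: (reproduced, reported in the paper).
scores = {
    "BLEU": (5.57, 6.39),
    "Rouge": (20.00, 22.38),
    "CIDEr": (46.33, 56.54),
}


def relative_gap(reproduced: float, reported: float) -> float:
    """Return the shortfall as a percentage of the reported score."""
    return (reported - reproduced) / reported * 100.0


# Relative shortfall per metric, in percent.
gaps = {name: relative_gap(*pair) for name, pair in scores.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:.1f}% below the paper")
```

With these numbers the shortfall is roughly 13% for BLEU, 11% for Rouge, and 18% for CIDEr, so CIDEr is the metric that deviates the most.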

apex Gradient overflow.

I've trained the JoGANIC model without apex using a small batch size, but it takes a long time and only reaches 4.64 BLEU. So I guess I have to train the model with apex.
But every time I use apex, it causes gradient overflow after 50~55 epochs.

Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.5
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.25
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.125
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.0625
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.03125
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.015625
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.0078125
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.00390625
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.001953125
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.0009765625
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 0.00048828125

After that I got a NaN loss for the caption. I didn't change any code from GitHub, I used the files in scripts to obtain the data, and I used the same config as Transform-and-Tell. I checked that the data is clean. Have you met the same problem during training? Could you give me some advice on how to fix it?
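
For readers unfamiliar with the log above, the following is a minimal sketch of how dynamic loss scaling generally behaves in mixed-precision training: the scale is halved and the optimizer step is skipped whenever a non-finite gradient is seen. This is a simplified illustration, not apex's actual implementation; the class `DynamicLossScaler` and its parameters `init_scale` and `growth_interval` are hypothetical names chosen for this example.

```python
import math


class DynamicLossScaler:
    """Toy dynamic loss scaler: halve on overflow, double after a clean streak."""

    def __init__(self, init_scale=1.0, growth_interval=2000):
        self.scale = init_scale
        self.growth_interval = growth_interval  # assumed value, for illustration
        self.clean_steps = 0

    def update(self, grads):
        """Return True if the optimizer step should run, False if it is skipped."""
        if any(not math.isfinite(g) for g in grads):
            # Overflow: skip this step and reduce the loss scale.
            self.scale /= 2.0
            self.clean_steps = 0
            return False
        self.clean_steps += 1
        if self.clean_steps % self.growth_interval == 0:
            # A long run of finite gradients: try a larger scale again.
            self.scale *= 2.0
        return True


# Eleven consecutive overflows, matching the eleven log lines above.
scaler = DynamicLossScaler(init_scale=1.0)
history = []
for _ in range(11):
    scaler.update([float("inf")])
    history.append(scaler.scale)

print(history[0], history[-1])
```

Starting from a scale of 1.0, eleven consecutive overflows take the scale from 0.5 down to 0.00048828125, the same sequence the log shows. One possible reading is that the gradients stay non-finite no matter how small the scale becomes, which would point at NaN or Inf values arising earlier in the computation rather than at the scale itself, but that is an assumption and not something the log alone confirms.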

does the model train with oracle labels or generated labels?

Since I don't have the config file, I guess I need to set oracle when training the model from scratch. Will that help model training? Thanks a lot in advance.

about other files to train the model

It is a great model for news image captioning! I'm doing research on this task too and want to explore your model further. Could you please send me the config.yaml? Also, I don't understand how to obtain classes_test.npy and nerts_test.json. I would appreciate it if you could tell me how to obtain them and how to train the model.

could I get the model weight?

I tried the same config as Transform and Tell, but with apex I get a NaN loss. Could I have the original config.yaml or the trained weights? I would like to include this well-performing model in my research. Thanks in advance!

Json output of the JoGANIC model

Hello,

Great work on the paper! We are working on a paper that compares against yours, so it would be great if you could provide the JSON output file for your models.

Thanks.

