Hi Apoorv, nice work. I have some issue about the QA-fine-tuning. I experimented w


Question about QA-fine-tuning about kgt5 HOT 7 CLOSED

czh17 commented on August 26, 2024

Question about QA-fine-tuning

from kgt5.

Comments (7)

apoorvumang commented on August 26, 2024

Hi @czh17, thanks for your interest.

Can you give more details on how you trained/pretrained? That is, the exact commands you ran and any dataset processing you did.

For the results reported in the paper, you need to pretrain on the MetaQA KG, not on Wikidata5M. Subsequently, you need to fine-tune on the QA dataset.


czh17 commented on August 26, 2024

Thank you for your reply.

> For the results reported in the paper, you need to pretrain on the MetaQA KG, not on Wikidata5M. Subsequently, you need to fine-tune on the QA dataset.

Yes, I noticed the problem you pointed out, so I reran the whole experiment. However, the results were still poor. The details of the experiment and the exact commands are as follows.

For KGC pretraining on the MetaQA KG:

dataset: 'data_kgqa//MetaQA_1hop_half//train_kgc_lines.txt' (only)
optimizer: adafactor
learning_rate: 1e-4
epoch: 200
model_size: small

In this stage, I used main_accelerate.py from the main branch for training. The loss did not decrease steadily; instead it rose sharply and then dropped again (for example, epoch loss: 100 -> 500 -> 2000 -> 90 -> 400). I also tried learning rates of 1e-5 and 1e-6, but the problem persisted.
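One way to judge whether a loss curve like this is diverging or just noisy is to smooth the logged values over a short window. The sketch below is only an illustration and is not taken from main_accelerate.py; the function name `moving_average` and the window size are assumptions, and the post does not say whether the logged loss is a sum or a mean over batches.

```python
from collections import deque


def moving_average(values, window=3):
    """Return the trailing moving average of `values` over `window` items."""
    buf = deque(maxlen=window)  # keeps only the most recent `window` values
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out


# Loss values quoted in the post above.
losses = [100, 500, 2000, 90, 400]
smoothed = moving_average(losses, window=3)
```

If the smoothed curve still trends upward over many logging intervals, the run is more likely unstable than merely noisy.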

For KBQA fine-tuning on MetaQA:

dataset: f'data_kgqa//MetaQA_{hops}hop_half//train.txt' (hops = [1,2,3]) and 'qa_test.txt'
optimizer: adafactor
learning_rate: 1e-4
epoch: 60
checkpoint: (the KGC model with the smallest loss).pt
beam_size: 1

In this stage, I also used main_accelerate.py from the main branch for training. For inference, each QA pair in qa_test.txt is converted to the form ' predict answer: Topic Entity token | question token with NE |/t answer token '. I also rewrote the eval function based on eval_accelerate.py from the apoorv-dump branch. Its evaluation criterion is that a prediction counts as correct if the answer generated by the model appears in the answer list.
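The evaluation criterion described above (a prediction is correct if it appears in the gold answer list) can be sketched as follows. This is not the code from eval_accelerate.py; the function name `hits_at_1` is hypothetical, and the whitespace stripping and lowercasing are assumptions that may or may not match the rewritten eval function.

```python
def hits_at_1(predictions, answer_lists):
    """Fraction of questions whose single prediction is in its gold answer list."""
    if not predictions:
        return 0.0
    correct = 0
    for pred, answers in zip(predictions, answer_lists):
        # Assumption: compare after stripping whitespace and lowercasing.
        gold = {a.strip().lower() for a in answers}
        if pred.strip().lower() in gold:
            correct += 1
    return correct / len(predictions)


# Toy example with made-up predictions and answer lists.
score = hits_at_1(
    ["Inception", "Paris"],
    [["inception", "Memento"], ["London"]],
)
```

With beam_size set to 1 there is one prediction per question, so this corresponds to a hits@1 style metric.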

Please let me know if there are any mistakes or details that I should have noticed in the above training process. Thanks again for your reply.


apoorvumang commented on August 26, 2024

Hmm, seems weird that loss fluctuates like that. Can you please post the exact commands you executed?


apoorvumang commented on August 26, 2024

Also, I would suggest you take a look at #11 as well, for details on how you can train the model in 1 go (concatenating qa and kgc lines).

I will try to post the pretrained checkpoints as well soon
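The idea of training in one go by concatenating QA and KGC lines could look roughly like the sketch below. It does not reproduce what #11 describes; the function name `merge_training_lines`, the fixed seed, and the shuffling step are all assumptions, and the example lines are placeholders rather than real MetaQA data.

```python
import random


def merge_training_lines(kgc_lines, qa_lines, seed=0):
    """Concatenate KGC and QA training lines and shuffle them reproducibly."""
    merged = list(kgc_lines) + list(qa_lines)
    # Assumption: shuffle so each batch mixes both kinds of example.
    random.Random(seed).shuffle(merged)
    return merged


# Placeholder lines standing in for the contents of the two training files.
kgc = ["kgc example 1", "kgc example 2"]
qa = ["qa example 1"]
merged = merge_training_lines(kgc, qa)
```

In practice the two lists would be read from files such as train_kgc_lines.txt and train.txt, and the merged lines written to a new training file.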


czh17 commented on August 26, 2024

> Hmm, seems weird that loss fluctuates like that. Can you please post the exact commands you executed?

Yes, this loss fluctuation is very confusing to me. Here is the command I executed:

python main_accelerate.py --save_prefix MetaQA_kgc_200_epoch --model_size base --dataset data_kgqa/MetaQA_1hop_half --split train_kgc_lines --batch_size 64 --save_steps 5000 --loss_steps 500 --learning_rate 0.0001

In this experiment, I have changed line 139 of main_accelerate.py to T5ForConditionalGeneration.from_pretrained('t5-base').


apoorvumang commented on August 26, 2024

Let me try and get back to you, sorry for the delay


czh17 commented on August 26, 2024

Would you mind sharing the code for the KBQA fine-tuning? This is very important for my research work, thanks again.
