Comments (7)

apoorvumang avatar apoorvumang commented on August 26, 2024

Hi @czh17 , thanks for your interest.

Can you give more details on how you trained/pretrained? i.e. the exact commands you ran + dataset processing done.

For the results reported in the paper, you need to pretrain on the MetaQA KG, not on Wikidata5M. Subsequently, you need to finetune on the QA dataset.

from kgt5.

czh17 avatar czh17 commented on August 26, 2024

Thank you for your reply.

> For results reported in the paper, you need to pretrain on the MetaQA KG, not on Wikidata5M. Subsequently you need to finetune on the QA dataset

Yes, I noticed the problem you pointed out, so I reran the whole experiment. However, the results were still not good. The details of the experiment and the exact commands are as follows.

For KGC pretraining on the MetaQA KG:

  • dataset: 'data_kgqa//MetaQA_1hop_half//train_kgc_lines.txt' (only)
  • optimizer: adafactor
  • learning_rate: 1e-4
  • epoch: 200
  • model_size: small

In this stage, I use main_accelerate.py from the main branch for training. The loss did not decrease steadily; instead it fluctuated, rising sharply and then falling again (for example, epoch loss: 100 -> 500 -> 2000 -> 90 -> 400). I also tried learning rates of 1e-5 and 1e-6, but the problem was not alleviated.

For KBQA fine-tuning on MetaQA:

  • dataset: f'data_kgqa//MetaQA_{hops}hop_half//train.txt' (hops = [1,2,3]) and 'qa_test.txt'
  • optimizer: adafactor
  • learning_rate: 1e-4
  • epoch: 60
  • checkpoint: (the KGC model with the smallest loss).pt
  • beam_size: 1

In this stage, I also use main_accelerate.py from the main branch for training. For inference, each QA pair in qa_test.txt is converted to the form 'predict answer: Topic Entity token | question token with NE |\t answer token'. I also rewrote the eval function based on eval_accelerate.py from the apoorv-dump branch; its criterion is that an answer is judged correct if the text generated by the model is in the answer list.
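The evaluation criterion described above can be sketched roughly as follows. This is not the code from eval_accelerate.py; the function names `normalize`, `is_correct`, and `hits_at_1` are hypothetical, and the lowercase/whitespace normalization is an assumption added for illustration, not something stated in this thread.

```python
from typing import Iterable, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization step)."""
    return " ".join(text.lower().split())


def is_correct(prediction: str, answers: Iterable[str]) -> bool:
    """Return True if the generated string matches any gold answer."""
    gold = {normalize(a) for a in answers}
    return normalize(prediction) in gold


def hits_at_1(predictions: List[str], answer_lists: List[List[str]]) -> float:
    """Fraction of questions whose single (beam size 1) generation is in the answer list."""
    if len(predictions) != len(answer_lists):
        raise ValueError("predictions and answer_lists must have the same length")
    if not predictions:
        return 0.0
    correct = sum(is_correct(p, a) for p, a in zip(predictions, answer_lists))
    return correct / len(predictions)
```

With beam_size 1 there is exactly one generated string per question, so this reduces to exact-match accuracy against the set of gold answers.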

Please let me know if there are any mistakes or details that I should have noticed in the above training process. Thanks again for your reply.

apoorvumang avatar apoorvumang commented on August 26, 2024

Hmm, seems weird that loss fluctuates like that. Can you please post the exact commands you executed?

apoorvumang avatar apoorvumang commented on August 26, 2024

Also, I would suggest you take a look at #11 as well, for details on how you can train the model in 1 go (concatenating qa and kgc lines).

I will also try to post the pretrained checkpoints soon.
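As a rough illustration of the "train in one go" idea mentioned above, the QA lines and KGC lines could be merged into a single training file. This is only a sketch under assumptions: the helper names `read_lines` and `concat_training_files` are hypothetical, the output file name is up to you, and the shuffle is an optional extra that may be unnecessary if the training script already shuffles its data. See #11 for the actual procedure.

```python
import random
from pathlib import Path
from typing import List


def read_lines(path: Path) -> List[str]:
    """Read non-empty lines from a text file, without trailing newlines."""
    with path.open(encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def concat_training_files(kgc_path: Path, qa_path: Path, out_path: Path, seed: int = 0) -> int:
    """Write KGC and QA lines into one file and return the number of lines written."""
    lines = read_lines(kgc_path) + read_lines(qa_path)
    # Shuffling is an assumption; drop it if the data loader shuffles anyway.
    random.Random(seed).shuffle(lines)
    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

For example, `concat_training_files(Path("train_kgc_lines.txt"), Path("train.txt"), Path("train_combined.txt"))` would produce one combined file; the last file name here is made up for the example.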

czh17 avatar czh17 commented on August 26, 2024

> Hmm, seems weird that loss fluctuates like that. Can you please post the exact commands you executed?

Yes, this loss fluctuation is very confusing to me. Here is the command I executed:

python main_accelerate.py --save_prefix MetaQA_kgc_200_epoch --model_size base --dataset data_kgqa/MetaQA_1hop_half --split train_kgc_lines --batch_size 64 --save_steps 5000 --loss_steps 500 --learning_rate 0.0001

In this experiment, I have changed line 139 of main_accelerate.py to T5ForConditionalGeneration.from_pretrained('t5-base').

apoorvumang avatar apoorvumang commented on August 26, 2024

Let me try it and get back to you. Sorry for the delay.

czh17 avatar czh17 commented on August 26, 2024

Would you mind sharing the code for the KBQA fine-tuning? This is very important for my research work, thanks again.
