
Comments (4)

JingqingZ commented on August 17, 2024

Hi, thanks for the question! Could you elaborate on what you mean by

The loss is not decreasing after this point and not converging or stuck to local minima.

Does the loss ever decrease in the first 20 epochs? What ROUGE scores have you achieved by fine-tuning PEGASUS and T5?

Any plans to release PEGASUS_base

Sorry, there is currently no plan to release base models due to checkpoint incompatibility.

from pegasus.

rohitsroch commented on August 17, 2024

Does the loss ever decrease in the first 20 epoch? What is the ROUGE score you have achieved by fine-tuning on PEGASUS and T5?

@JingqingZ Apologies for the confusion. Yes, the loss decreases smoothly for the first 15-20 epochs but it doesn't converge. Below is the reference plot during training with learning rate 2e-4.

[training loss plot]

  • If you check after 5k steps (15 epochs), the loss barely changes and stays almost constant (~1.5). I also tried training for a further 5 epochs with the learning rate increased to 2e-3, but it diverged and then settled back to the same value (~1.5). Any thoughts on what I should do?

  • I also tried training the model with a triangular cyclical learning rate policy, but the same behavior occurred.

  • That said, the results/summaries on the evaluation set look good. Below are the ROUGE scores for PEGASUS (large) and T5 (small), using the following decoding params for both:

  beam_size = 1
  top_p = 0.95
  top_k = 50
  temperature = 0.5

NOTE: The scores below are averaged across 78 datapoints in the eval set.

PEGASUS (large)

            ROUGE-1   ROUGE-2   ROUGE-L
precision   0.493     0.237     0.368
recall      0.532     0.263     0.403
f-measure   0.486     0.237     0.365

T5 (small)

            ROUGE-1   ROUGE-2   ROUGE-L
precision   0.507     0.211     0.363
recall      0.443     0.189     0.322
f-measure   0.455     0.192     0.329
  • I didn't use the beam search algorithm for decoding, as it takes a lot of time per input (with a beam size of 5) on an n1-standard VM with 8 vCPUs.
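
As a point of reference for the scores above: the official way to compute these metrics is Google's `rouge_score` package, but since that may not be installed, here is a minimal pure-Python sketch of ROUGE-N precision/recall/F1. It omits stemming and the LCS-based ROUGE-L that the official scorer includes, so it is an illustration of the metric, not a drop-in replacement.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a multiset of n-grams from a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N: n-gram overlap between reference and candidate.

    Returns (precision, recall, f-measure). No stemming or tokenizer
    normalization, unlike the official rouge_score package.
    """
    ref_ngrams = ngrams(reference.lower().split(), n)
    cand_ngrams = ngrams(candidate.lower().split(), n)
    if not ref_ngrams or not cand_ngrams:
        return 0.0, 0.0, 0.0
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in the reference.
    overlap = sum((ref_ngrams & cand_ngrams).values())
    precision = overlap / sum(cand_ngrams.values())
    recall = overlap / sum(ref_ngrams.values())
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# Toy example (not from the eval set above).
p, r, f = rouge_n("the cat sat on the mat", "the cat sat", n=1)
```

Averaging these per-example scores over the eval set gives tables like the ones above.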


JingqingZ commented on August 17, 2024

Hi, thanks for the information!

I think the overall performance (given the learning curve and ROUGE scores) of PEGASUS looks reasonable, so I don't think anything is wrong there. But it can apparently be improved by tuning some hyper-parameters, which needs some empirical experiments.

the loss decreases smoothly for the first 15-20 epochs but it doesn't converge. Below is the reference plot during training with learning rate 2e-4.

It seems the loss is still decreasing and the fine-tuning may need more steps. In Appendix C of our paper, we provide a full table of the hyper-parameters used to fine-tune on each dataset, and most of them use more fine-tuning steps (and possibly a larger batch size) than yours. The learning rate can also be made smaller if the fluctuation of the loss persists.
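
One common way to get both a gentler effective learning rate and stability over longer runs is a warmup-then-decay schedule. As a generic illustration only (these constants are not the values from the PEGASUS paper's Appendix C), here is a linear-warmup / inverse-square-root-decay schedule of the kind widely used when fine-tuning Transformer models:

```python
def lr_schedule(step, base_lr=2e-4, warmup_steps=1000):
    """Linear warmup to base_lr, then inverse-square-root decay.

    A common Transformer fine-tuning schedule; the constants here are
    illustrative, not taken from the PEGASUS paper.
    """
    step = max(step, 1)
    if step < warmup_steps:
        return base_lr * step / warmup_steps          # ramp up linearly
    return base_lr * (warmup_steps / step) ** 0.5     # decay as 1/sqrt(step)
```

Ramping up and then decaying smoothly avoids the kind of divergence seen when jumping straight from 2e-4 to 2e-3 mid-training.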

Considering the relatively small eval set (78 examples), some slight fluctuation of the loss on the eval set is possible.

I didn't use Beam search algorithm for decoding

Beam search can actually improve ROUGE quite significantly, by a couple of points.
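
The reason beam search helps over greedy or sampled decoding is that it keeps several partial hypotheses alive and ranks them by total log-probability, rather than committing to one token at a time. The sketch below shows the core loop over a hypothetical `score_fn(seq, tok)` interface (an assumption for illustration; in practice you would use the decoder's built-in beam search rather than reimplement it):

```python
import math

def beam_search(score_fn, vocab, start, beam_size=5, max_len=4):
    """Toy beam search over a scoring function.

    Keeps the beam_size highest total-log-probability partial sequences
    at each step. score_fn(seq, tok) returns the log-probability of
    appending tok to seq (a hypothetical interface for this sketch).
    """
    beams = [([start], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, logp in beams:
            for tok in vocab:
                candidates.append((seq + [tok], logp + score_fn(seq, tok)))
        # Prune: keep only the best beam_size hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0]

# Hypothetical bigram scorer that strongly favours repeating the last
# token, purely for demonstration.
def score_fn(seq, tok):
    return math.log(0.9) if tok == seq[-1] else math.log(0.1)

best_seq, best_logp = beam_search(score_fn, vocab=["a", "b"], start="a",
                                  beam_size=2, max_len=3)
```

The cost is roughly `beam_size` forward passes per step, which is why decoding with beam size 5 is noticeably slower on a CPU-only VM.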

Hope this may answer your questions!


rohitsroch commented on August 17, 2024

@JingqingZ Thanks a lot for the quick help, I will check Appendix C in the paper :). Closing this issue!

