
Comments (11)

markus-eberts avatar markus-eberts commented on July 28, 2024

Hi,
this should not be the case and I currently do not have any explanation for this. The code snippet you posted is fine. Why do you think it is still the pretrained model? And can you post the library versions and the configuration you used?

from spert.

victorbai2 avatar victorbai2 commented on July 28, 2024

@markus-eberts Thanks for your response. I first checked the model size in the data/save/... directory and found that pytorch_model.bin is the same size as the downloaded pretrained model (413M). I then evaluated both models on the evaluation dataset, and the results are identical.

The configuration and other things are the same.


victorbai2 avatar victorbai2 commented on July 28, 2024

@markus-eberts please find the below example that I tested in google colab.

Spert.ipynb.zip


markus-eberts avatar markus-eberts commented on July 28, 2024

I just updated the repository (some changes due to the upgrade to a new 'transformers' version) and requirements.txt. Model saving works fine on my side. Could you please pull the latest changes, use the library versions in requirements.txt, and try again?


victorbai2 avatar victorbai2 commented on July 28, 2024

@markus-eberts Hi, I applied the changes in Google Colab, but the result is unfortunately the same and the weights are not saved. I set the epoch count to 3 for testing and checked the size of the pytorch_model.bin saved in dir /save/data..../final_model.

Is it the same on your end? I wonder whether model.save_pretrained(dir_path) only saves the pretrained weights, as the name suggests.

BTW, I even ran "python ./spert.py eval --config configs/example_eval.conf" with the purely pretrained weights (pytorch_model.bin), and surprisingly it could be evaluated. Where are all the other layers and weights that come after the CLS layer?


victorbai2 avatar victorbai2 commented on July 28, 2024

@markus-eberts I think I understand now: I was using the pytorch_model.bin that you trained, which I downloaded to data/model/pytorch_model.bin.

But one thing still seems strange to me: why is the trained, saved model (pytorch_model.bin) the same size as the original pretrained model? After training, shouldn't the model become much larger, as it does with TensorFlow?


markus-eberts avatar markus-eberts commented on July 28, 2024

Is it the same on your end? I wonder whether model.save_pretrained(dir_path) only saves the pretrained weights, as the name suggests.

The 'save_pretrained' method of 'transformers' definitely saves the whole model. I use the library a lot, and this is also stated in the documentation.
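Since two checkpoints of the same size can still contain different weights, one quick way to confirm that the saved file really differs from the pretrained one is to hash both files. A minimal stdlib sketch (the paths in the comments are placeholders for your own setup):

```python
import hashlib


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


# Hypothetical paths -- adjust to your directories:
# pretrained = "data/models/bert-base-cased/pytorch_model.bin"
# finetuned = "data/save/.../final_model/pytorch_model.bin"
# Equal sizes but different digests => the weights did change during training.
# print(file_sha256(pretrained) == file_sha256(finetuned))
```

If the digests differ, training updated the weights and saving worked, regardless of the file size.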

But one thing seems strange to me: why is the trained, saved model (pytorch_model.bin) the same size as the original pretrained model?

I'm not sure whether you are comparing against the CoNLL04 model provided by us or the bert-base-cased model downloaded via the 'transformers' library. The CoNLL04 'pytorch_model.bin' trained by us is already fine-tuned on the task of joint entity and relation extraction, so it should roughly match the size of your trained model and give good evaluation results. Regarding the bert-base-cased model (MLM pre-trained, but not fine-tuned on the target task), I also do not expect a large size difference to a fine-tuned model, since we only add shallow (relative to BERT) linear layers.
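Some back-of-the-envelope arithmetic makes this concrete. The parameter count below is approximate for bert-base-cased, and the head dimensions are purely illustrative (not the exact SpERT classifier sizes):

```python
# Rough size check: why task heads barely change the checkpoint size.
bert_params = 108_000_000  # bert-base-cased, approximately
hidden = 768               # BERT-base hidden size

# Hypothetical linear task heads on top of a few concatenated BERT vectors:
entity_head = (hidden * 3) * 8    # e.g. 3 concatenated vectors -> 8 entity types
relation_head = (hidden * 3) * 5  # e.g. -> 5 relation types
added = entity_head + relation_head

print(f"added params: {added:,}")                      # tens of thousands
print(f"fraction of BERT: {added / bert_params:.6f}")  # well under 0.1%
print(f"BERT file size ~ {bert_params * 4 / 2**20:.0f} MiB (float32)")
```

At four bytes per float32 parameter, ~108M parameters is roughly 412 MiB, which matches the ~413M file size, and the extra linear layers are lost in rounding.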


victorbai2 avatar victorbai2 commented on July 28, 2024

@markus-eberts I compared your trained model with bert-base-cased; the two are the same size.

BTW, did you write all the code yourself? It is very high-quality code.


markus-eberts avatar markus-eberts commented on July 28, 2024

I compared your trained model with bert-base-cased; the two are the same size.

This is reasonable. When you use your trained model for evaluation (e.g. 'python ./spert.py eval --config configs/example_eval.conf' with model_path/tokenizer_path set to your model), it should give you similar results to those on the validation dataset (as output after training). In that case, everything works as expected and the model was saved correctly.
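For reference, the relevant lines of the eval config might look like the following sketch (only the model_path/tokenizer_path keys mentioned above are taken from the discussion; the section header and path are placeholders):

```ini
[1]
model_path = data/save/.../final_model
tokenizer_path = data/save/.../final_model
```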

BTW, did you write all the code yourself? It is very high-quality code.

Yes, and thank you. I try my best to make the code 'readable' and easy to follow. However, since this is just the code accompanying a research paper, its main purpose is to reproduce our evaluation results. I often wish I had done some parts better (from a software-architecture point of view) but lacked the time to do so. After all, the next paper deadline is usually right around the corner ;). Of course, I'm glad that the code and the SpERT model itself are useful for the research community and beyond.


victorbai2 avatar victorbai2 commented on July 28, 2024

@markus-eberts You are really productive. I would like to read your next paper once it is published.


markus-eberts avatar markus-eberts commented on July 28, 2024

@markus-eberts You are really productive. I would like to read your next paper once it is published.

Thanks.

