Comments (10)
These may be helpful to you, depending on what you want:
finetune BERT with custom dataset #20
single sentence classifier #21
Do you want to fine-tune with an unlabeled dataset or a labeled one?
If your dataset is labeled, is it sentence-pair, single-sentence, or something else, such as part-of-speech tagging?
Suppose your labeled dataset consists of sentence pairs with binary labels and you want to fine-tune specific layers: simply feed the data into the model and make sure layer.trainable = True is set for those layers.
from keras-bert.
AFAIK, fine-tuning means loading a trained model from a checkpoint and then training it again on your own dataset.
I've checked the latest code on the master branch and found that two parameters control the behavior of the core class method: trainable and training. Moreover, the method that loads a trained model from a checkpoint sets these two parameters consistently, and both default to False.
So here is my question: to fine-tune the model, can I simply set these two parameters to True? What are the consequences of this change? For example, how would the new model be saved?
@CyberZHG, would you kindly respond to the above? We need some assurance here.
Thanks in advance.
from keras-bert.
By setting trainable to True, you may get better results, at the cost of extra memory and time, since the intermediate results and gradients of the whole model must be computed and stored. With trainable set to False, the model works just like a fixed dynamic embedding.
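To get a rough feel for the memory side of that trade-off, here is a back-of-envelope count of how many weights receive gradients in each setting. It is only a sketch: it assumes the standard BERT-Base (uncased, English) hyperparameters and a hypothetical two-class dense head, leaves out the pooler and pre-training heads, and does not read anything from a real checkpoint or from keras-bert itself.

```python
# Back-of-envelope parameter count, assuming the standard BERT-Base
# (uncased, English) hyperparameters. These numbers are assumptions,
# not values read from any particular checkpoint.
VOCAB, HIDDEN, LAYERS, FFN, MAX_POS, TYPE_VOCAB = 30522, 768, 12, 3072, 512, 2
NUM_CLASSES = 2  # hypothetical binary classification head


def backbone_params():
    """Weights in the embeddings plus the Transformer encoder stack."""
    layer_norm = 2 * HIDDEN  # gain and bias
    embeddings = (VOCAB + MAX_POS + TYPE_VOCAB) * HIDDEN + layer_norm
    attention = 4 * (HIDDEN * HIDDEN + HIDDEN) + layer_norm  # Q, K, V, output
    feed_forward = (HIDDEN * FFN + FFN) + (FFN * HIDDEN + HIDDEN) + layer_norm
    return embeddings + LAYERS * (attention + feed_forward)


def head_params():
    """A single dense layer on top of one hidden vector."""
    return HIDDEN * NUM_CLASSES + NUM_CLASSES


backbone = backbone_params()
head = head_params()
frozen_trainable = head           # trainable=False: only the head gets gradients
full_trainable = backbone + head  # trainable=True: everything gets gradients

for label, n in [("frozen backbone", frozen_trainable),
                 ("fully trainable", full_trainable)]:
    # 4 bytes per float32 gradient; optimizer state and activations come on top.
    print(f"{label}: {n:,} trainable weights, {n * 4 / 2**20:.1f} MiB of gradients")
```

The gradient buffers are only part of the cost. With the backbone trainable, the activations of every layer also have to be kept for the backward pass, which is usually the larger share.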
from keras-bert.
Saving works the same way as for any other Keras model. You don't need to worry about the checkpoints, because they are used only to initialize the weights.
from keras-bert.
@CyberZHG Thanks for the prompt reply. That's great.
One more thing I'd like to confirm: can I treat setting trainable to True as the fine-tuning procedure? Am I right?
from keras-bert.
And don't forget there is a classification example: https://colab.research.google.com/github/CyberZHG/keras-bert/blob/master/demo/tune/keras_bert_classification_tpu.ipynb
from keras-bert.
@CyberZHG I will, greatly appreciated!
from keras-bert.
Actually, you can train the model with trainable=False at the beginning, then set trainable=True with a small learning rate for further tuning.
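The two-phase schedule can be illustrated with a toy model in plain Python. This is not keras-bert code: the "encoder" is a single affine map standing in for the pre-trained weights, the "head" is one scalar, and all names, learning rates, and epoch counts are made up for the sketch. Phase 1 updates only the head; phase 2 unfreezes everything and continues with a smaller learning rate.

```python
def predict(p, x):
    """Toy model: a 'pre-trained' encoder (enc_w, enc_b) followed by a scalar head."""
    return p["head"] * (p["enc_w"] * x + p["enc_b"])


def mse(p, data):
    return sum((predict(p, x) - y) ** 2 for x, y in data) / len(data)


def train(p, trainable, data, lr, epochs):
    """Plain SGD; only the parameter names listed in `trainable` are updated."""
    for _ in range(epochs):
        for x, y in data:
            hidden = p["enc_w"] * x + p["enc_b"]
            err = p["head"] * hidden - y
            grads = {
                "enc_w": 2 * err * p["head"] * x,
                "enc_b": 2 * err * p["head"],
                "head": 2 * err * hidden,
            }
            for name in trainable:  # frozen parameters are simply skipped
                p[name] -= lr * grads[name]
    return p


# Downstream task: y = 2x on x in [-1, 1]. The "pre-trained" encoder carries
# an offset (enc_b = 0.5) that does not suit this task.
data = [(i / 10, 2.0 * i / 10) for i in range(-10, 11)]
params = {"enc_w": 1.0, "enc_b": 0.5, "head": 0.0}
loss_start = mse(params, data)

# Phase 1: encoder frozen (the trainable=False case), train only the head.
train(params, trainable=["head"], data=data, lr=0.1, epochs=20)
enc_after_phase1 = (params["enc_w"], params["enc_b"])
loss_phase1 = mse(params, data)

# Phase 2: unfreeze everything (the trainable=True case), smaller learning rate.
train(params, trainable=["enc_w", "enc_b", "head"], data=data, lr=0.01, epochs=50)
loss_phase2 = mse(params, data)

print("start:", loss_start, "phase 1:", loss_phase1, "phase 2:", loss_phase2)
```

In this toy the frozen encoder caps how well the head alone can fit the data, and unfreezing lets the loss drop further. With a real pre-trained model, the small learning rate in the second phase is meant to keep the updates from wiping out what the pre-trained weights already encode.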
from keras-bert.
So if I have to train a model from scratch, I should set trainable=False; whereas if I already have a trained model in hand, I can set trainable=True to fine-tune it for specific tasks.
from keras-bert.