Comments (1)
Hi, @sabetAI.
Sorry for the slow reply.
We did not run into this problem in our experiments: after some number of updates, the model starts to produce other tags as well.
I think that the following could be helpful.
- Exclude true negatives from the data (tn_prob=0) during the pretraining stage.
- Use a bigger batch_size (at least 128; 256 is better).
- Use more data if possible.
- Freeze the encoder weights during the first couple of epochs (cold_step_count in [2,4]).
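To make two of the suggestions above concrete, here is a minimal pure-Python sketch. It is not gector's actual code: the helper names `drop_true_negatives` and `encoder_is_frozen` are hypothetical. It assumes that a "true negative" is a sentence pair whose source and target are identical, and that tn_prob is the probability of keeping such a pair.

```python
import random


def drop_true_negatives(pairs, tn_prob, rng=None):
    """Keep every pair that contains an edit; keep unchanged pairs with probability tn_prob."""
    rng = rng or random.Random(0)
    kept = []
    for source, target in pairs:
        is_true_negative = source == target
        # rng.random() is in [0, 1), so tn_prob=0 drops every true negative.
        if is_true_negative and rng.random() >= tn_prob:
            continue
        kept.append((source, target))
    return kept


def encoder_is_frozen(epoch, cold_epochs):
    """Epochs are 0-based; the encoder stays frozen for the first cold_epochs epochs."""
    return epoch < cold_epochs


pairs = [("He go home", "He goes home"), ("I am fine", "I am fine")]
filtered = drop_true_negatives(pairs, tn_prob=0.0)
schedule = [encoder_is_frozen(epoch, cold_epochs=2) for epoch in range(4)]
```

With tn_prob=0.0 only the pair containing a correction survives, and with cold_epochs=2 the encoder would be trained from the third epoch onwards.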
Additionally, you could modify the mask to lower the weights for the KEEP operation. Something like this:
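keep_bias = -0.5  # a negative bias lowers the loss weight of KEEP tokens
# Use float tensors so the fractional bias is not truncated, and multiply by the mask so padded positions keep weight 0.
is_keep = (labels == self.keep_index).float()
weights = mask.float() * (1.0 + keep_bias * is_keep)
loss_labels = sequence_cross_entropy_with_logits(logits_labels, labels, weights, label_smoothing=self.label_smoothing)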
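As a rough illustration of what this reweighting does, here is a small pure-Python sketch with made-up numbers. The helpers `token_weights` and `weighted_mean_loss` are hypothetical, and the sketch assumes the loss is normalised by the sum of the weights, which is a simplification of what the real loss function does.

```python
def token_weights(labels, mask, keep_index, keep_bias=-0.5):
    """Weight 1.0 for ordinary tokens, 1.0 + keep_bias for KEEP tokens, 0.0 for padding."""
    return [
        m * (1.0 + keep_bias * (1.0 if label == keep_index else 0.0))
        for label, m in zip(labels, mask)
    ]


def weighted_mean_loss(token_losses, weights):
    """Weighted average of per-token losses (assumed normalisation: sum of weights)."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(loss * w for loss, w in zip(token_losses, weights)) / total


# Made-up example: label 0 stands for KEEP, the last position is padding.
labels = [0, 5, 0, 0]
mask = [1, 1, 1, 0]
token_losses = [1.0, 3.0, 1.0, 9.0]

weights = token_weights(labels, mask, keep_index=0)
plain = weighted_mean_loss(token_losses, [float(m) for m in mask])
biased = weighted_mean_loss(token_losses, weights)
```

In this toy case the single non-KEEP token contributes a larger share of the biased loss than of the plain masked average, which is the intended effect of lowering the KEEP weights.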
I hope that this will be useful to you.
from gector.