armandds / bert_for_long_text
Using BERT For Classifying Documents with Long Texts, check my latest post: https://armandolivares.tech/
```
/usr/local/lib/python3.6/dist-packages/bert/optimization.py in <module>()
     85
     86
---> 87 class AdamWeightDecayOptimizer(tf.train.Optimizer):
     88   """A basic Adam optimizer that includes "correct" L2 weight decay."""
     89
AttributeError: module 'tensorflow._api.v2.train' has no attribute 'Optimizer'
```
TensorFlow = 2.2.0
TensorFlow Hub = 0.8.0
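The `bert` pip package (bert-tensorflow) was written against TF 1.x, where `tf.train.Optimizer` was a public base class; it is gone from the 2.x API, hence the error above. Two general TF-migration workarounds (my suggestions, not something from this repo):

```shell
# Option 1: run the notebook under TF 1.x, which bert-tensorflow targets
pip install "tensorflow==1.15" "tensorflow-hub==0.8.0"

# Option 2: stay on TF 2.x and patch bert/optimization.py to use the
# v1 compatibility symbol, which still exposes the old base class:
#   class AdamWeightDecayOptimizer(tf.compat.v1.train.Optimizer):
```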
Hi, thanks for your work on this.
I want to replicate your implementation. Where is the data you used?
I downloaded a CSV file from Kaggle, but the total number of entries differs from your version.
Right now the long-document classification model performs very badly: loss is around 2 and accuracy is only 0.33, and I don't know why. I used the same data pre-processing. Below is the output after training for 10 epochs.

```
Epoch 10/10
4565/4571 [============================>.] - ETA: 0s - loss: 1.8058 - acc: 0.3544
Epoch 1/10
 120/4571 [..............................] - ETA: 19s - loss: 1.8386 - acc: 0.3463
Epoch 00010: ReduceLROnPlateau reducing learning rate to 8.573749619245064e-06.
4571/4571 [==============================] - 35s 8ms/step - loss: 1.8057 - acc: 0.3544 - val_loss: 1.8358 - val_acc: 0.3473
```
Below is the code for model creation.

```python
def create_keras_model(dimension=None,
                       mask_value=-99.0,
                       lstm_dim=100,
                       dense_dim=30,
                       num_labels=2,
                       learning_rate=2e-5):
    text_input = layers.Input(shape=(None, 768), dtype='float32', name='segments')
    l_mask = layers.Masking(mask_value=mask_value)(text_input)
    encoded_text = layers.LSTM(lstm_dim)(l_mask)
    out_dense = layers.Dense(dense_dim, activation='relu')(encoded_text)
    out = layers.Dense(num_labels, activation='softmax')(out_dense)
    model = tf.keras.Model(inputs=text_input, outputs=out, name="RoBERT")
    if num_labels == 2:
        model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=learning_rate),
                      loss='binary_crossentropy',
                      metrics=['acc'])
    else:
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
                      loss='sparse_categorical_crossentropy',
                      metrics=['acc'])
    return model
```

Please advise and help.
Hello,
```python
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
    bert_module = hub.Module(
        BERT_MODEL_HUB,
        trainable=True)
```

In this part, how do I add my own custom BERT model instead of the hub.Module? I mean, I have a Mongolian-language BERT, and I think this BERT is English-only:

```python
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"
```
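For what it's worth, `hub.Module` accepts either a TF Hub URL or a local path to a module export, so a non-English BERT can be swapped in. A sketch (the multilingual URL is a real TF Hub module, but whether its ~100 languages cover Mongolian well enough is something to verify; the local path is a hypothetical placeholder):

```python
import tensorflow_hub as hub

# Option A: swap in the multilingual checkpoint (covers ~100 languages;
# verify Mongolian coverage for your use case):
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_multi_cased_L-12_H-768_A-12/1"

# Option B: point hub.Module at a local TF-Hub-format export of your own
# Mongolian BERT (hypothetical path):
# bert_module = hub.Module("/path/to/mongolian_bert_hub_module", trainable=True)
```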
`def get_split(text1)` seems to be buggy:
when num_words = 201, the number of blocks should be 2 ([200 words, 51 words]),
but the code gives only one block.
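One way the splitter could be written so that the trailing remainder becomes its own (overlapping) block; this is a hypothetical reimplementation with assumed defaults (200-word windows, 50-word overlap), not the author's code:

```python
def get_split(text, chunk_size=200, overlap=50):
    """Split text into chunk_size-word blocks with `overlap` shared words."""
    words = text.split()
    stride = chunk_size - overlap  # step between window starts
    chunks = []
    for start in range(0, len(words), stride):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks

# 201 words -> two blocks of 200 and 51 words (the second overlaps the first by 50)
print([len(c.split()) for c in get_split(" ".join(["w"] * 201))])  # [200, 51]
```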
In your code you overlap 50 tokens between consecutive rows of the data frame. But how does that contribute to the result, given the title says "BERT for long documents"? You have in any case restricted the maximum sequence length (MSL) to 200.
Can you please explain the logic behind it?
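One way to read the design (my interpretation of the post, not the author's exact words): the 200-token MSL applies per segment only. Each 200-word window is embedded by BERT separately, and the LSTM then reads the sequence of per-segment embeddings, so the model sees the whole document even though BERT never does; the 50-token overlap just keeps context from being cut mid-sentence at window boundaries. A small sketch of the segment arithmetic, assuming those window sizes:

```python
import math

def n_segments(num_words, chunk_size=200, overlap=50):
    """How many overlapping windows a document of num_words becomes."""
    stride = chunk_size - overlap  # 150 new words per extra window
    if num_words <= chunk_size:
        return 1
    return math.ceil((num_words - chunk_size) / stride) + 1

# A 700-word document becomes 5 overlapping 200-word segments; the LSTM
# then consumes a sequence of 5 BERT embeddings (shape (5, 768)).
print(n_segments(700))  # 5
```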
```python
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization
```
Where do the values 11 and 187 come from in:

```python
batch_size_val = 11
batches_per_epoch_val = 187
```
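Presumably (an assumption on my part, not stated in the repo) they were chosen so that batch size times batches per epoch exactly covers the validation set: 11 × 187 = 2057 samples, so no partial batch is left over. A small sketch of picking such a pair for any dataset size (`pick_batch_size` is a hypothetical helper, not from the repo):

```python
def pick_batch_size(n_samples, max_batch=64):
    """Largest batch size <= max_batch that divides n_samples evenly,
    returned together with the resulting number of batches per epoch."""
    for b in range(max_batch, 0, -1):
        if n_samples % b == 0:
            return b, n_samples // b
    return 1, n_samples  # unreachable for n_samples >= 1, kept for safety

# 2057 = 11 * 11 * 17, so with max_batch=11 you recover the repo's pair:
print(pick_batch_size(2057, max_batch=11))  # (11, 187)
```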