armandds / bert_for_long_text
Using BERT For Classifying Documents with Long Texts, check my latest post: https://armandolivares.tech/
```
/usr/local/lib/python3.6/dist-packages/bert/optimization.py in <module>()
     85
     86
---> 87 class AdamWeightDecayOptimizer(tf.train.Optimizer):
     88   """A basic Adam optimizer that includes "correct" L2 weight decay."""
     89
AttributeError: module 'tensorflow._api.v2.train' has no attribute 'Optimizer'
```
TensorFlow = 2.2.0
TensorFlow Hub = 0.8.0
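The `bert` pip package (bert-tensorflow) was written against TF 1.x, where `tf.train.Optimizer` was a public base class; it is gone from the 2.x API, hence the error above. Two general TF-migration workarounds (my suggestions, not something from this repo):

```shell
# Option 1: run the notebook under TF 1.x, which bert-tensorflow targets
pip install "tensorflow==1.15" "tensorflow-hub==0.8.0"

# Option 2: stay on TF 2.x and patch bert/optimization.py to use the
# v1 compatibility symbol, which still exposes the old base class:
#   class AdamWeightDecayOptimizer(tf.compat.v1.train.Optimizer):
```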
Hi, thanks for your work on this.
I want to replicate your implementation. Where is the data you used?
I downloaded a CSV file from Kaggle, but the total number of entries differs from your version.
Right now the long-document classification model performs very badly: loss is around 2 and accuracy is only 0.33, and I don't know why. I used the same data pre-processing. Below is the output after training for 10 epochs.

```
Epoch 10/10
4565/4571 [============================>.] - ETA: 0s - loss: 1.8058 - acc: 0.3544
Epoch 1/10
 120/4571 [..............................] - ETA: 19s - loss: 1.8386 - acc: 0.3463
Epoch 00010: ReduceLROnPlateau reducing learning rate to 8.573749619245064e-06.
4571/4571 [==============================] - 35s 8ms/step - loss: 1.8057 - acc: 0.3544 - val_loss: 1.8358 - val_acc: 0.3473
```
Below is the code for model creation.

```python
def create_keras_model(dimension=None,
                       mask_value=-99.0,
                       lstm_dim=100,
                       dense_dim=30,
                       num_labels=2,
                       learning_rate=2e-5):
    text_input = layers.Input(shape=(None, 768), dtype='float32', name='segments')
    l_mask = layers.Masking(mask_value=mask_value)(text_input)
    encoded_text = layers.LSTM(lstm_dim)(l_mask)
    out_dense = layers.Dense(dense_dim, activation='relu')(encoded_text)
    out = layers.Dense(num_labels, activation='softmax')(out_dense)
    model = tf.keras.Model(inputs=text_input, outputs=out, name="RoBERT")
    if num_labels == 2:
        model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=learning_rate),
                      loss='binary_crossentropy',
                      metrics=['acc'])
    else:
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
                      loss='sparse_categorical_crossentropy',
                      metrics=['acc'])
    return model
```

Please advise and help.
Hello,
```python
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
    bert_module = hub.Module(
        BERT_MODEL_HUB,
        trainable=True)
```

In this part, how do I add my own custom BERT model instead of the hub.Module? I mean, I have a Mongolian-language BERT, and I think this BERT is English-only:

```python
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"
```
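For what it's worth, `hub.Module` accepts either a TF Hub URL or a local path to a module export, so a non-English BERT can be swapped in. A sketch (the multilingual URL is a real TF Hub module, but whether its ~100 languages cover Mongolian well enough is something to verify; the local path is a hypothetical placeholder):

```python
import tensorflow_hub as hub

# Option A: swap in the multilingual checkpoint (covers ~100 languages;
# verify Mongolian coverage for your use case):
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_multi_cased_L-12_H-768_A-12/1"

# Option B: point hub.Module at a local TF-Hub-format export of your own
# Mongolian BERT (hypothetical path):
# bert_module = hub.Module("/path/to/mongolian_bert_hub_module", trainable=True)
```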
`def get_split(text1)` seems to be buggy:
when num_words = 201, the number of blocks should be 2 ([200 words, 51 words]),
but the code gives only one block.
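One way the splitter could be written so that the trailing remainder becomes its own (overlapping) block; this is a hypothetical reimplementation with assumed defaults (200-word windows, 50-word overlap), not the author's code:

```python
def get_split(text, chunk_size=200, overlap=50):
    """Split text into chunk_size-word blocks with `overlap` shared words."""
    words = text.split()
    stride = chunk_size - overlap  # step between window starts
    chunks = []
    for start in range(0, len(words), stride):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks

# 201 words -> two blocks of 200 and 51 words (the second overlaps the first by 50)
print([len(c.split()) for c in get_split(" ".join(["w"] * 201))])  # [200, 51]
```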
In your code you overlap 50 tokens between consecutive rows of the data frame. But how does that contribute to the result, given the title says "BERT for long documents"? You have in any case restricted the maximum sequence length (MSL) to 200.
Can you please explain the logic behind it?
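One way to read the design (my interpretation of the post, not the author's exact words): the 200-token MSL applies per segment only. Each 200-word window is embedded by BERT separately, and the LSTM then reads the sequence of per-segment embeddings, so the model sees the whole document even though BERT never does; the 50-token overlap just keeps context from being cut mid-sentence at window boundaries. A small sketch of the segment arithmetic, assuming those window sizes:

```python
import math

def n_segments(num_words, chunk_size=200, overlap=50):
    """How many overlapping windows a document of num_words becomes."""
    stride = chunk_size - overlap  # 150 new words per extra window
    if num_words <= chunk_size:
        return 1
    return math.ceil((num_words - chunk_size) / stride) + 1

# A 700-word document becomes 5 overlapping 200-word segments; the LSTM
# then consumes a sequence of 5 BERT embeddings (shape (5, 768)).
print(n_segments(700))  # 5
```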
```python
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization
```
Where do the values 11 and 187 come from in:

```python
batch_size_val = 11
batches_per_epoch_val = 187
```
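Presumably (an assumption on my part, not stated in the repo) they were chosen so that batch size times batches per epoch exactly covers the validation set: 11 × 187 = 2057 samples, so no partial batch is left over. A small sketch of picking such a pair for any dataset size (`pick_batch_size` is a hypothetical helper, not from the repo):

```python
def pick_batch_size(n_samples, max_batch=64):
    """Largest batch size <= max_batch that divides n_samples evenly,
    returned together with the resulting number of batches per epoch."""
    for b in range(max_batch, 0, -1):
        if n_samples % b == 0:
            return b, n_samples // b
    return 1, n_samples  # unreachable for n_samples >= 1, kept for safety

# 2057 = 11 * 11 * 17, so with max_batch=11 you recover the repo's pair:
print(pick_batch_size(2057, max_batch=11))  # (11, 187)
```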