dongjun-lee / text-classification-models-tf

Tensorflow implementations of Text Classification Models.

Python 100.00%

tensorflow text-classification

text-classification-models-tf's Introduction

Text Classification Models with Tensorflow

Tensorflow implementation of Text Classification Models.

Implemented Models:

Word-level CNN [paper]
Character-level CNN [paper]
Very Deep CNN [paper]
Word-level Bidirectional RNN
Attention-Based Bidirectional RNN [paper]
RCNN [paper]

Semi-supervised text classification (transfer learning) models are implemented at [dongjun-Lee/transfer-learning-text-tf].

Requirements

Python3
Tensorflow
pip install -r requirements.txt

Usage

Train

To train a classification model on the DBpedia dataset:

$ python train.py --model="<MODEL>"

Test

To measure classification accuracy on the test data after training:

$ python test.py --model="<TRAINED_MODEL>"

Sample Test Results

Trained and tested on the DBpedia dataset (dbpedia_csv/train.csv, dbpedia_csv/test.csv).

Model	WordCNN	CharCNN	VDCNN	WordRNN	AttentionRNN	RCNN	*SA-LSTM	*LM-LSTM
Accuracy	98.42%	98.05%	97.60%	98.57%	98.61%	98.68%	98.88%	98.86%

(SA-LSTM and LM-LSTM are implemented at [dongjun-Lee/transfer-learning-text-tf].)

Models

4. Word-level Bi-RNN

Bi-directional RNN for Text Classification.

Embedding layer
Bidirectional RNN layer
Concat all the outputs from RNN layer
Fully-connected layer
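
The four layers listed above can be sketched as a forward pass. This is a minimal pure-Python illustration, not the repository's TensorFlow code: it uses a plain tanh RNN cell with small random weights, all names (`rnn_pass`, `bi_rnn_logits`, `init_params`) are hypothetical, and it assumes "concat all the outputs" means concatenating the forward and backward outputs of every time step into one flat vector.

```python
import math
import random

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def rand_matrix(rows, cols, rng):
    """Create a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def rnn_pass(inputs, w_in, w_rec):
    """Run a plain tanh RNN over inputs and return the hidden state at every step."""
    hidden = [0.0] * len(w_rec)
    outputs = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        outputs.append(hidden)
    return outputs

def bi_rnn_logits(token_ids, params):
    # 1. Embedding layer: look up one vector per token id.
    embedded = [params["embedding"][t] for t in token_ids]
    # 2. Bidirectional RNN layer: one pass left-to-right, one right-to-left.
    fw = rnn_pass(embedded, params["fw_in"], params["fw_rec"])
    bw = rnn_pass(embedded[::-1], params["bw_in"], params["bw_rec"])[::-1]
    # 3. Concatenate the outputs of every time step into one flat vector.
    flat = [v for f, b in zip(fw, bw) for v in f + b]
    # 4. Fully-connected layer: one logit per class.
    return matvec(params["fc"], flat)

def init_params(vocab_size, embed_dim, hidden_dim, seq_len, num_classes, seed=0):
    rng = random.Random(seed)
    return {
        "embedding": rand_matrix(vocab_size, embed_dim, rng),
        "fw_in": rand_matrix(hidden_dim, embed_dim, rng),
        "fw_rec": rand_matrix(hidden_dim, hidden_dim, rng),
        "bw_in": rand_matrix(hidden_dim, embed_dim, rng),
        "bw_rec": rand_matrix(hidden_dim, hidden_dim, rng),
        "fc": rand_matrix(num_classes, 2 * hidden_dim * seq_len, rng),
    }

params = init_params(vocab_size=10, embed_dim=4, hidden_dim=3, seq_len=5, num_classes=2)
logits = bi_rnn_logits([1, 4, 2, 0, 0], params)  # five token ids, padded with 0
```

Because the fully-connected layer in this sketch consumes every time step, each input has to be padded or truncated to the same fixed length.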

5. Attention-Based Bi-RNN

Implementation of Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification.

6. RCNN

Implementation of Recurrent Convolutional Neural Networks for Text Classification.

text-classification-models-tf's Issues

Chinese text?

How to predict for one sentence?

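In conv_block in vd_cnn.py, is the conv from the first loop iteration never used?

Every convolution takes input as its input, and the second iteration assigns to the same conv variable, so isn't the result of the first iteration discarded?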
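
The pattern the question describes can be reproduced without TensorFlow. The sketch below is a hypothetical stand-in (the `layer` function and both loop variants are illustrative and not copied from vd_cnn.py). When every iteration reads from the original input, the first iteration's result is overwritten and never used. When each iteration reads the previous output, every layer contributes.

```python
def layer(values, scale):
    """Stand-in for a convolution: multiplies every value by scale."""
    return [v * scale for v in values]

def block_unchained(inputs, scales):
    # Each iteration reads the original inputs, so only the last result survives.
    out = None
    for s in scales:
        out = layer(inputs, s)
    return out

def block_chained(inputs, scales):
    # Each iteration reads the previous output, so every layer contributes.
    out = inputs
    for s in scales:
        out = layer(out, s)
    return out

x = [1.0, 2.0]
unchained = block_unchained(x, [2.0, 3.0])  # same as layer(x, 3.0)
chained = block_chained(x, [2.0, 3.0])      # same as layer(layer(x, 2.0), 3.0)
```

Whether vd_cnn.py actually behaves like the unchained variant has to be checked against the source file itself.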

Got this error running on a Google Cloud Platform TPU

Tensorflow 2.x
Python 3.7.3

Traceback (most recent call last):
  File "train3.py", line 74, in <module>
    model = WordCNN(vocabulary_size, WORD_MAX_LEN, NUM_CLASS)
  File "/home/migueltuxd/bucket4testingtpuss/TPU/text-classification-models-tf-master/cnn_models/word_cnn.py", line 54, in __init__
    self.optimizer = tf.train.AdamOptimizer(self.learning_rate).minimize(self.loss, global_step=self.global_step)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/optimizer.py", line 413, in minimize
    name=name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/optimizer.py", line 564, in apply_gradients
    raise RuntimeError("Use _distributed_apply() instead of "
RuntimeError: Use _distributed_apply() instead of apply_gradients() in a cross-replica context.

Also posted on Stack Overflow: https://stackoverflow.com/questions/61704387/wordcnn-trouble-with-distributed-apply-and-apply-gradients

data_utils

In data_utils.py, some of the index assignments made by build_char_dataset do not seem reasonable:

when model = "char_cnn":  
    char_dict["<pad>"] = char_dict["<unk>"] = char_dict["a"] = 0 after onehot in char_cnn.py

when model = "vd_cnn": 
    char_dict["<pad>"] = char_dict["a"] = 0, 
    char_dict["<unk>"] = char_dict["b"] = 1
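
One way to avoid the collisions described above is to reserve separate indices for the padding and unknown tokens before assigning indices to the alphabet. This is a hypothetical sketch, not the repository's build_char_dataset: the alphabet string and the helpers `build_char_dict` and `encode` are assumptions made for illustration.

```python
def build_char_dict(alphabet):
    """Map <pad> to 0, <unk> to 1, and each alphabet character to a unique index from 2 up."""
    char_dict = {"<pad>": 0, "<unk>": 1}
    for ch in alphabet:
        if ch not in char_dict:
            char_dict[ch] = len(char_dict)
    return char_dict

def encode(text, char_dict, max_len):
    """Convert text to a fixed-length list of indices, truncating or padding as needed."""
    ids = [char_dict.get(ch, char_dict["<unk>"]) for ch in text.lower()[:max_len]]
    return ids + [char_dict["<pad>"]] * (max_len - len(ids))

char_dict = build_char_dict("abcdefghijklmnopqrstuvwxyz0123456789 ")
encoded = encode("Ab~", char_dict, max_len=5)  # "~" is not in the alphabet
```

With this layout, no alphabet character shares an index with `<pad>` or `<unk>`, so a one-hot encoding of the indices keeps the three cases distinguishable.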

Which version of TensorFlow does the code run with?

It crashes in TensorFlow 1.4 with the error below:

Traceback (most recent call last):
  File "train.py", line 47, in <module>
    model = VDCNN(alphabet_size, CHAR_MAX_LEN, NUM_CLASS)
  File "text-classification-models-tf/cnn_models/vd_cnn.py", line 70, in __init__
    tf.nn.softmax_cross_entropy_with_logits_v2(logits=self.logits, labels=y_one_hot))
AttributeError: module 'tensorflow.python.ops.nn' has no attribute 'softmax_cross_entropy_with_logits_v2'

Word-level CNN model performs poorly on the SST-2 dataset

Have you tested on the SST-2 dataset?
I ran the word_cnn model on SST-2 and got only 0.38 accuracy, whereas the paper reports 0.45.
How should the model be changed to improve the accuracy? Is something wrong?
The word_rnn model reaches 0.45 accuracy. Why is the difference so large?

How can I get prediction results from test.py?

How can I get prediction results in test.py?
The current implementation reports only accuracy, so precision, recall, F1, and the predictions themselves are hard to inspect.
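
Once predicted and true class ids have been collected from a test run, per-class precision, recall, and F1 can be computed directly. The sketch below assumes the predictions are already available as plain Python lists. How to extract them from test.py is not shown here, and `per_class_metrics` is a hypothetical helper rather than part of the repository.

```python
from collections import Counter

def per_class_metrics(y_true, y_pred):
    """Return {class: (precision, recall, f1)} for every class seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but wrongly
            fn[t] += 1  # true class t was missed
    metrics = {}
    for c in sorted(set(y_true) | set(y_pred)):
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[c] = (precision, recall, f1)
    return metrics

# Toy labels for illustration only; these are not results from the repository.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = per_class_metrics(y_true, y_pred)
for cls, (p, r, f) in metrics.items():
    print(f"class {cls}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```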

ResourceExhaustedError: OOM

I hit an error when training the att_rnn model.

Machine: 4 * Tesla P100-PCIE-16GB, memory: 256G

ResourceExhaustedError (see above for traceback): OOM when allocating tensor of shape [563354,256] and type float [[Node: embeddings/Adam/Initializer/zeros = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [563354,256] values: [0 0 0]...>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
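
The size of the tensor named in the error can be estimated from its shape. This is a back-of-the-envelope sketch under two assumptions: each float element takes 4 bytes (float32), and the Adam optimizer keeps two extra tensors of the same shape per variable (the failing node, embeddings/Adam/Initializer/zeros, looks like one of them).

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed for a dense tensor of the given shape (float32 by default)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element

embedding_shape = (563354, 256)           # shape reported in the OOM message
one_copy = tensor_bytes(embedding_shape)  # the embedding matrix itself
with_adam = 3 * one_copy                  # variable plus two Adam slots (assumed)

print(f"one copy:  {one_copy / 2**20:.1f} MiB")
print(f"with Adam: {with_adam / 2**20:.1f} MiB")
```

Under these assumptions one copy is about 550 MiB, so the embedding plus its optimizer state comes to roughly 1.6 GiB.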