
text-classification's Introduction

Text-Classification

Implementations of several state-of-the-art text classification models in TensorFlow.

Requirement

  • Python3
  • TensorFlow >= 1.4

Note: The original code was written in TensorFlow 1.4. Since VocabularyProcessor is deprecated, the updated code uses tf.keras.preprocessing.text for preprocessing. The new preprocessing function is named data_preprocessing_v2.
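
As a rough sketch of what tokenizer-based preprocessing can look like (the function below mirrors the described behavior, but its exact signature and parameters are illustrative assumptions, not the repo's verbatim code):

    import tensorflow as tf
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    def data_preprocessing_v2(train_texts, test_texts, max_len, max_words=50000):
        # Build the vocabulary from the training texts only
        tokenizer = Tokenizer(num_words=max_words)
        tokenizer.fit_on_texts(train_texts)
        # Map texts to integer id sequences, padded/truncated to max_len
        x_train = pad_sequences(tokenizer.texts_to_sequences(train_texts), maxlen=max_len)
        x_test = pad_sequences(tokenizer.texts_to_sequences(test_texts), maxlen=max_len)
        vocab_size = min(max_words, len(tokenizer.word_index) + 1)
        return x_train, x_test, vocab_size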

Dataset

You can load the data with

dbpedia = tf.contrib.learn.datasets.load_dataset('dbpedia', test_with_fake_data=FLAGS.test_with_fake_data)

Or download it from Baidu Yun.
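
FLAGS refers to the TF 1.x command-line flags object defined at the top of the script; a minimal sketch, assuming the conventional tf.app.flags setup:

    import tensorflow as tf

    flags = tf.app.flags
    flags.DEFINE_bool('test_with_fake_data', False,
                      'Use a small fake dataset instead of the full DBpedia download.')
    FLAGS = flags.FLAGS

    dbpedia = tf.contrib.learn.datasets.load_dataset(
        'dbpedia', test_with_fake_data=FLAGS.test_with_fake_data)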

Attention Is All You Need

Paper: Attention Is All You Need

See multi_head.py

Uses self-attention, where Query = Key = Value = the sentence after word embedding.

The multi-head attention module is implemented by Kyubyong.
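
As a single-head illustration of the idea (Kyubyong's module implements the full multi-head version), scaled dot-product self-attention in TF 1.x looks roughly like this:

    import tensorflow as tf

    def self_attention(batch_embedded, num_units):
        # Query = Key = Value = the embedded sentence, each projected linearly
        Q = tf.layers.dense(batch_embedded, num_units)  # [batch, time, units]
        K = tf.layers.dense(batch_embedded, num_units)
        V = tf.layers.dense(batch_embedded, num_units)
        # Scaled dot-product attention over the time dimension
        scores = tf.matmul(Q, K, transpose_b=True)      # [batch, time, time]
        scores = scores / tf.sqrt(tf.cast(num_units, tf.float32))
        weights = tf.nn.softmax(scores)                 # softmax over the last axis
        return tf.matmul(weights, V)                    # [batch, time, units]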

IndRNN for Text Classification

Paper: Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN

The IndRNNCell is implemented by batzener.
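
The key idea of IndRNN is that each neuron has an independent recurrent weight, so the recurrence is elementwise: h_t = relu(W x_t + u ⊙ h_{t-1} + b), where u is a vector rather than a full recurrent matrix. This keeps gradients stable over much longer sequences and allows deeply stacked IndRNN layers.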

Attention-Based Bidirectional LSTM for Text Classification

Paper: Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification

See attn_bi_lstm.py
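
The attention mechanism in the paper computes a softmax over the time steps of the Bi-LSTM outputs and returns a weighted sentence representation. A minimal sketch of that idea (not necessarily the exact code in attn_bi_lstm.py):

    import tensorflow as tf

    def attention(rnn_outputs):
        # rnn_outputs: [batch, time, hidden], e.g. the summed fw/bw Bi-LSTM outputs
        hidden_size = rnn_outputs.get_shape()[-1].value
        w = tf.Variable(tf.truncated_normal([hidden_size], stddev=0.1))
        M = tf.tanh(rnn_outputs)                           # [batch, time, hidden]
        alpha = tf.nn.softmax(tf.tensordot(M, w, axes=1))  # [batch, time]
        r = tf.reduce_sum(rnn_outputs * tf.expand_dims(alpha, -1), axis=1)
        return tf.tanh(r)                                  # [batch, hidden]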

Hierarchical Attention Networks for Text Classification

Paper: Hierarchical Attention Networks for Document Classification

See attn_lstm_hierarchical.py

The attention module is implemented by ilivans/tf-rnn-attention.

Adversarial Training Methods For Supervised Text Classification

Paper: Adversarial Training Methods For Semi-Supervised Text Classification

See: adversarial_abblstm.py
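
The core of the method is perturbing the word embeddings in the direction of the loss gradient and training on the perturbed batch as well. A minimal sketch of that step, assuming the names add_perturbation and epsilon for illustration (the code in adversarial_abblstm.py may differ):

    import tensorflow as tf

    def add_perturbation(embedded, loss, epsilon=5.0):
        # Gradient of the loss w.r.t. the embedded batch, treated as a constant
        grad, = tf.gradients(loss, embedded)
        grad = tf.stop_gradient(grad)
        # Scale the perturbation to L2 norm epsilon per example
        norm = tf.sqrt(tf.reduce_sum(tf.square(grad), axis=[1, 2], keep_dims=True) + 1e-12)
        return embedded + epsilon * grad / norm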

Convolutional Neural Networks for Sentence Classification

Paper: Convolutional Neural Networks for Sentence Classification

See: cnn.py
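
Kim's model applies convolution filters of several widths over the embedded sentence and max-pools each feature map over time. A minimal sketch (the filter sizes and counts here are illustrative, not necessarily cnn.py's values):

    import tensorflow as tf

    def text_cnn(batch_embedded, filter_sizes=(3, 4, 5), num_filters=100):
        # batch_embedded: [batch, time, embed_dim]
        pooled = []
        for size in filter_sizes:
            # 1-D convolution over time, then max-over-time pooling
            conv = tf.layers.conv1d(batch_embedded, num_filters, size,
                                    activation=tf.nn.relu)
            pooled.append(tf.reduce_max(conv, axis=1))  # [batch, num_filters]
        return tf.concat(pooled, axis=1)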

RMDL: Random Multimodel Deep Learning for Classification

Paper: RMDL: Random Multimodel Deep Learning for Classification

See: RMDL.py and the RMDL GitHub repository.

Note: The parameters are not fine-tuned; you can modify the kernels as you like.

Performance

Model                                 Test Accuracy   Notes
Attention-based Bi-LSTM               98.23%
HAN                                   89.15%          1080Ti, 10 epochs, 12 min
Adversarial Attention-based Bi-LSTM   98.5%           AWS p2, 2 hours
IndRNN                                98.39%          1080Ti, 10 epochs, 10 min
Attention Is All You Need             97.81%          1080Ti, 15 epochs, 8 min
RMDL                                  98.91%          2x Tesla Xp (3 RDLs)
CNN                                   98.37%

Welcome To Contribute

If you have implemented a model with great performance, you're welcome to contribute. Also, I'm glad to help if you have any problems with the project; feel free to raise an issue.

text-classification's People

Contributors

kk7nc, TobiasLee

text-classification's Issues

cannot load the dataset

The tf.contrib module is deprecated, so the data can't be loaded.
Also, the link given for downloading the data no longer works.
Could you please check this?

How do I save the model?

How should I save this model? I want to write a standalone prediction function. At the training step I call saver.save(sess, '../save_model/atttn_lstm_hierarchical/'), and when loading the model for prediction I use:

    with tf.Session() as sess:
        saver = tf.train.import_meta_graph('./save_model/atttn_lstm_hierarchical/-1000.meta')
        saver.restore(sess, tf.train.latest_checkpoint('./save_model/atttn_lstm_hierarchical'))

But I get the following error:

[screenshot of the error message omitted]

What could the problem be?

wrong with cnn.py

Could you tell me what's wrong when I run cnn.py and it exits with "Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)"?

A question about the code for Adversarial Training Methods For Semi-Supervised Text Classification

Hello, I would like to ask about this code:

    logits, self.cls_loss = cal_loss_logit(batch_embedded, self.keep_prob, reuse=False)
    embedding_perturbated = self._add_perturbation(batch_embedded, self.cls_loss)
    adv_logits, self.adv_loss = cal_loss_logit(embedding_perturbated, self.keep_prob, reuse=True)

Why is reuse=False when no perturbation is added, but reuse=True when computing the loss on the perturbed embedding?

Thanks!

Wrong output dimension of the embedding_lookup table

I looked again and again, and the dimension of the output of the embedding lookup should be [?, 256, 128], but you have written it as [?, 256, 100] in the comments. Either I don't understand the concept, or something is really wrong there. Please clarify.

Is y_hat in attn_bi_lstm.py wrong?

    FC_W = tf.Variable(tf.truncated_normal([self.hidden_size, self.max_len], stddev=0.1))

This is right before computing the predicted y, and there are only 15 classes, not 32. Shouldn't the shape of FC_W be [self.hidden_size, self.n_class] rather than [self.hidden_size, self.max_len]? FC_b below has the same problem. Could you take a look at whether this is a bug, or am I misunderstanding something? @TobiasLee

Use pre-trained embeddings instead of random ones

Hi there, as I saw in attn_bi_lstm.py (and maybe the other approaches too), the word embeddings are randomly initialized. I tried to use pre-trained embeddings, but I don't know how to set them up, and I get the error "must have rank at least 3" (sorry, I am new to TensorFlow). Thank you, much appreciated.

Word embedding

    embeddings_var = tf.Variable(tf.random_uniform([self.vocab_size, self.embedding_size], -1.0, 1.0),
                                 trainable=True)
    batch_embedded = tf.nn.embedding_lookup(embeddings_var, self.x)

    rnn_outputs, _ = bi_rnn(BasicLSTMCell(self.hidden_size),
                            BasicLSTMCell(self.hidden_size),
                            inputs=batch_embedded, dtype=tf.float32)

My trial

import numpy as np

def generate_embedding(word_index, model_embedding, EMBEDDING_DIM):
    count6 = 0      # number of words found in the pre-trained model
    countNot6 = 0   # number of out-of-vocabulary words
    # Random initialization; rows for in-vocabulary words are overwritten below
    # embedding_matrix = np.zeros((len(word_index) + 1, EMBEDDING_DIM))
    embedding_matrix = np.asarray([np.random.uniform(-0.01, 0.01, EMBEDDING_DIM)
                                   for _ in range(len(word_index) + 1)])
    list_oov = []
    for word, i in word_index.items():
        try:
            embedding_vector = model_embedding[word]
        except KeyError:
            list_oov.append(word)
            countNot6 += 1
            continue
        if embedding_vector is not None:
            count6 += 1
            embedding_matrix[i] = embedding_vector
    return embedding_matrix

batch_embedded = generate_embedding(word_index, word_embedding, EMBEDDING_DIM)
rnn_outputs, _ = bi_rnn(BasicLSTMCell(self.hidden_size),
                        BasicLSTMCell(self.hidden_size),
                        inputs=batch_embedded, dtype=tf.float32)

Note that I get the error at inputs=batch_embedded.
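
For what it's worth, the usual TF 1.x pattern is to keep tf.nn.embedding_lookup and initialize the embedding variable from the pre-trained matrix, so the RNN still receives a rank-3 [batch, time, dim] tensor instead of the 2-D matrix itself. A minimal sketch, using the names from the snippets above:

    embedding_matrix = generate_embedding(word_index, word_embedding, EMBEDDING_DIM)

    # Initialize the embedding table from the pre-trained matrix; the ids in
    # self.x are still looked up, so batch_embedded keeps rank 3.
    embeddings_var = tf.Variable(embedding_matrix.astype('float32'),
                                 trainable=True)  # set trainable=False to freeze
    batch_embedded = tf.nn.embedding_lookup(embeddings_var, self.x)

    rnn_outputs, _ = bi_rnn(BasicLSTMCell(self.hidden_size),
                            BasicLSTMCell(self.hidden_size),
                            inputs=batch_embedded, dtype=tf.float32)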

How can I load data

When I run the statement, it gives me a NameError:
"NameError: name 'FLAGS' is not defined"
Could you please tell me what 'FLAGS' refers to?

validation and testing accuracy=0

First of all, thank you for your effort. My problem is that the implementation runs correctly when I use the dbpedia dataset, but when I try it on my own dataset I get an accuracy of 0. My dataset is in Arabic.

Test Accuracy is lower than the Performance in Readme

Epoch 19 start !
Train Epoch time: 108.163 s
validation accuracy: 0.932
Epoch 20 start !
Train Epoch time: 107.236 s
validation accuracy: 0.936
Training finished, time consumed : 2188.0112912654877 s
Start evaluating:

Test accuracy : 93.619048 %

This is the performance of attn_bi_lstm.py. Why is it lower than the 98.23% in the README?
Thanks!

data not exist

I ran your model adversarial_abblstm.py, but it reports that the file /dbpedia_data/dbpedia_csv/train.csv does not exist. Could you tell me where to download train.csv and test.csv?

ValueError: Cannot feed value of shape (32, 15) for Tensor 'Placeholder_1:0', which has shape '(?,)'

I get the above error when I run the Bi-LSTM attention program on the DBpedia dataset.

Traceback (most recent call last):
  File "attn_bi_lstm.py", line 112, in
    return_dict = run_train_step(classifier, sess, (x_batch, y_batch))
  File "/home/kbk/Desktop/BudddiHealth/higher models/Text-Classification-master/models/utils/model_helper.py", line 26, in run_train_step
    return sess.run(to_return, feed_dict)
  File "/home/kbk/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 877, in run
    run_metadata_ptr)
  File "/home/kbk/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1076, in _run
    str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (32, 15) for Tensor 'Placeholder_1:0', which has shape '(?,)'

Error in attn_bi_lstm.py while feeding data label during training

> python attn_bi_lstm.py

InvalidArgumentError (see above for traceback): Received a label value of -2147483648 which is outside the valid range of [0, 15).  Label values: -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648
	 [[{{node SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}} = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](xw_plus_b, _arg_Placeholder_1_0_1)]]

Printing y_train values:


5164    NaN
3458    NaN
3236    NaN
3118    NaN
1555    NaN
930     NaN
3188    NaN
2899    NaN
2918    NaN
1431    NaN
2373    NaN
1205    NaN
2734    NaN
2560    NaN
1495    NaN
5430    NaN
2912    NaN
2098    NaN
2410    NaN
4482    NaN
1045    1.0
2469    NaN
1703    NaN
250     NaN
5214    NaN
4767    NaN
849     NaN
976     NaN
5489    NaN
5545    NaN
5241    NaN
3128    NaN

Thanks

attn_bi_lstm.py

Excuse me, in the code of attn_bi_lstm.py, in the graph of ABLSTM, I don't see where you use attention. Maybe you only use the Bi-LSTM?
