
text-classification-pytorch's People

Contributors

prakashpandey9


text-classification-pytorch's Issues

Data not matching the batch size is discarded

Hi! Your code has been of great help to me, thank you so much.

When experimenting with the same test data as the CNN in your repo, I compared the results with a TF-IDF + LR baseline I wrote and found that the test data is not fully used during testing. So I wrote a version like this:

with torch.no_grad():
    for idx, batch in enumerate(val_iter):
        text = batch.Text[0]
        # Since the batch size is passed into the model, keep a variable holding
        # the actual size of this batch (the last batch may be smaller).
        pass_batch_size = None
        if text.size()[0] != batch_size:
            pass_batch_size = len(text)   # pass_batch_size is the batch size of this batch
        target = batch.Label
        target = torch.autograd.Variable(target).long()
        if torch.cuda.is_available():
            # text = text.cuda()
            # target = target.cuda()
            text = text.to(device)
            target = target.to(device)
        prediction = model(text, pass_batch_size)   # pass it into the model
In LSTM.py, the model initializes h_0 and c_0, which I think is why you defined the parameter 'batch_size=None' in 'forward()'.

TypeError: __init__() got an unexpected keyword argument 'tensor_type'

Running main.py produces this error:

C:\Python36\python.exe I:/github_repos/Text-Classification-Pytorch/main.py
Traceback (most recent call last):
  File "I:/github_repos/Text-Classification-Pytorch/main.py", line 11, in <module>
    TEXT, vocab_size, word_embeddings, train_iter, valid_iter, test_iter = load_data.load_dataset()
  File "I:\github_repos\Text-Classification-Pytorch\load_data.py", line 31, in load_dataset
    LABEL = data.LabelField(tensor_type=torch.FloatTensor)
  File "C:\Python36\lib\site-packages\torchtext\data\field.py", line 699, in __init__
    super(LabelField, self).__init__(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'tensor_type'

Process finished with exit code 1
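
For reference: this usually means the installed torchtext is newer than the version the repo was written against; the Field/LabelField keyword tensor_type was renamed to dtype in later torchtext releases. A likely fix (assuming torchtext 0.3 or newer) is to change line 31 of load_data.py to:

    import torch
    from torchtext import data

    # `tensor_type=torch.FloatTensor` was the old spelling; newer torchtext expects `dtype`.
    LABEL = data.LabelField(dtype=torch.float)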

How to use external text data?

Hello. If I want to load separate external CSV files for training and testing, where the first column of the CSV contains the sentences and the second column contains the labels, what should I change?
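
For reference, a minimal sketch of loading external train/test CSVs with the legacy torchtext API this repo uses; the path and file names below are assumptions, and load_data.load_dataset() would then build its vocabulary and iterators from these datasets instead of the built-in dataset it currently downloads.

    import torch
    from torchtext import data

    TEXT = data.Field(sequential=True, lower=True, include_lengths=True,
                      batch_first=True, fix_length=200)
    LABEL = data.LabelField()

    # First CSV column -> sentence, second column -> label (file names are assumptions).
    train_data, test_data = data.TabularDataset.splits(
        path='data', train='train.csv', test='test.csv', format='csv',
        fields=[('Text', TEXT), ('Label', LABEL)], skip_header=True)

    TEXT.build_vocab(train_data, vectors='glove.6B.300d')
    LABEL.build_vocab(train_data)

    train_iter, test_iter = data.BucketIterator.splits(
        (train_data, test_data), batch_size=32,
        sort_key=lambda x: len(x.Text), repeat=False, shuffle=True)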

Use of permute in RCNN

On lines 63 and 73 of Text-Classification-Pytorch/models/RCNN.py, the permute function is used.

input = input.permute(1, 0, 2) # input.size() = (num_sequences, batch_size, embedding_length)
...
y = y.permute(0, 2, 1) # y.size() = (batch_size, hidden_size, num_sequences)

Could you please explain why it is necessary or useful to permute the dimensions of these tensors?
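
For context, a minimal sketch of what those permutes do (the sizes below are made up for illustration): nn.LSTM with the default batch_first=False expects input shaped (seq_len, batch, input_size), while F.max_pool1d pools over the last dimension, so the tensor has to be rearranged before each of those calls.

    import torch
    import torch.nn.functional as F

    batch_size, num_sequences, embedding_length, hidden_size = 4, 10, 300, 256  # illustrative sizes

    x = torch.randn(batch_size, num_sequences, embedding_length)
    x = x.permute(1, 0, 2)   # (num_sequences, batch_size, embedding_length): the layout nn.LSTM expects by default
    print(x.shape)           # torch.Size([10, 4, 300])

    y = torch.randn(batch_size, num_sequences, hidden_size)
    y = y.permute(0, 2, 1)   # (batch_size, hidden_size, num_sequences): max_pool1d pools over the last (sequence) axis
    pooled = F.max_pool1d(y, y.size(2)).squeeze(2)
    print(pooled.shape)      # torch.Size([4, 256]) -- one max-pooled feature vector per example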

How to use other models?

I find that "main.py" only supports the LSTM. I want to know how to use the other models for text classification, such as RNN, CNN, RCNN, attention, and self-attention. The other models need a "weight" parameter that is not found in "main.py". How should the "weight" parameter be set in order to call the other models?
Thanks a lot!
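
Not from the repo author, but a sketch of the usual pattern: the "weight" argument is simply the pretrained embedding matrix that load_data.load_dataset() already returns as word_embeddings, and the other constructors appear to follow the same argument order as the LSTM. The module and class names below are assumptions about the repo layout.

    import load_data
    from models.selfAttention import SelfAttention   # assumed module/class name

    TEXT, vocab_size, word_embeddings, train_iter, valid_iter, test_iter = load_data.load_dataset()

    batch_size, output_size, hidden_size, embedding_length = 32, 2, 256, 300
    # word_embeddings (the pretrained GloVe matrix) fills the "weights" parameter.
    model = SelfAttention(batch_size, output_size, hidden_size,
                          vocab_size, embedding_length, word_embeddings)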

How to use the CNN model?

Can you give an example of how to use the CNN model, like you did for the LSTM?
I have trouble understanding the parameters required to get the CNN up and running.

Thanks in advance!
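
Not an official answer, but a rough sketch of how the CNN might be constructed; the exact constructor signature below is an assumption modeled on a typical Kim-style text CNN with several kernel heights, so check models/CNN.py for the real argument list.

    from models.CNN import CNN   # assumed module/class name

    batch_size, output_size = 32, 2
    in_channels, out_channels = 1, 100     # one input channel, 100 feature maps per kernel size
    kernel_heights = [3, 4, 5]             # n-gram window sizes over the word dimension
    stride, padding, keep_probab = 1, 0, 0.5
    embedding_length = 300

    # vocab_size and word_embeddings come from load_data.load_dataset(), as for the LSTM.
    model = CNN(batch_size, output_size, in_channels, out_channels, kernel_heights,
                stride, padding, keep_probab, vocab_size, embedding_length, word_embeddings)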

Does the code work for torchtext.datasets.SST.splits(TEXT, LABEL)?

Running with torchtext's SST dataset produces this error:

/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
(the same assertion failure is repeated for threads [11,0,0], [19,0,0], [20,0,0], and [26,0,0])
Traceback (most recent call last):
  File "./main.py", line 91, in <module>
    train_loss, train_acc = train_model(model, train_iter, epoch)
  File "./main.py", line 44, in train_model
    loss.backward()
  File "....", line 118, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "....", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorMath.cu:26

I checked the target values:

tensor([2, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 2, 1, 0, 0, 0, 0, 0, 1, 2, 2, 0, 1, 1, 0, 1, 2, 1, 1, 0, 1, 1], device='cuda:0')

there is "2" here that I think makes the problem. Would you please guide me regarding this issue?

Bert embeddings

Hi

I am trying to use the models you implemented with BERT embeddings for Arabic, but I am getting very low accuracy. I am wondering if I am doing something wrong, especially since I am a newbie to deep learning.
Here is my modification to the attention model:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable


class SelfAttention(nn.Module):
    def __init__(self, batch_size, output_size, hidden_size, bert):
        """
        Arguments
        ---------
        batch_size : Size of the batch, which is the same as the batch_size of the data returned by the TorchText BucketIterator
        output_size : 2 = (pos, neg)
        hidden_size : Size of the hidden_state of the LSTM
        bert : Pre-trained BERT model whose last hidden states are used as the word embeddings
        """
        super(SelfAttention, self).__init__()

        self.batch_size = batch_size
        self.output_size = output_size
        self.hidden_size = hidden_size
        #self.vocab_size = vocab_size
        self.bert = bert

        embedding_length = bert.config.to_dict()['hidden_size']
        print(embedding_length)
        #self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        #self.word_embeddings.weights = nn.Parameter(weights, requires_grad=False)
        self.dropout = 0.8
        self.bilstm = nn.LSTM(embedding_length, hidden_size, dropout=self.dropout, bidirectional=True)
        # We use da = 350, r = 30 & penalization_coeff = 1, as given in the original self-attention ICLR paper.
        self.W_s1 = nn.Linear(2*hidden_size, 350)
        self.W_s2 = nn.Linear(350, 30)
        self.fc_layer = nn.Linear(30*2*hidden_size, 2000)
        self.label = nn.Linear(2000, output_size)

    def attention_net(self, lstm_output):
        """
        We use the self-attention mechanism to produce a matrix embedding of the input sentence in which every row
        represents an encoding of the input sentence, each attending to a specific part of it. We use 30 such
        embeddings of the input sentence, concatenate them, and connect the result to a fully connected layer of
        size 2000, which is connected to the output layer of size 2 returning logits for our two classes, i.e.,
        pos & neg.

        Arguments
        ---------
        lstm_output = A tensor containing hidden states corresponding to each time step of the LSTM network.

        Returns
        -------
        Final attention weight matrix for all 30 different sentence embeddings, each of which attends to a
        different part of the input sentence.

        Tensor sizes : lstm_output.size() = (batch_size, num_seq, 2*hidden_size)
                       attn_weight_matrix.size() = (batch_size, 30, num_seq)
        """
        attn_weight_matrix = self.W_s2(F.tanh(self.W_s1(lstm_output)))
        attn_weight_matrix = attn_weight_matrix.permute(0, 2, 1)
        attn_weight_matrix = F.softmax(attn_weight_matrix, dim=2)

        return attn_weight_matrix

    def forward(self, input_sentences, batch_size=None):
        """
        Parameters
        ----------
        input_sentences : input sentences of shape = (batch_size, num_sequences)
        batch_size : default = None. Used only for prediction on a single sentence after training (batch_size = 1)

        Returns
        -------
        Output of the linear layer containing logits for the pos & neg classes.
        """
        with torch.no_grad():
            input = self.bert(input_sentences)[0]

        input = input.permute(1, 0, 2)
        if batch_size is None:
            h_0 = Variable(torch.zeros(2, self.batch_size, self.hidden_size).cuda())
            c_0 = Variable(torch.zeros(2, self.batch_size, self.hidden_size).cuda())
        else:
            h_0 = Variable(torch.zeros(2, batch_size, self.hidden_size).cuda())
            c_0 = Variable(torch.zeros(2, batch_size, self.hidden_size).cuda())

        output, (h_n, c_n) = self.bilstm(input, (h_0, c_0))
        output = output.permute(1, 0, 2)
        # output.size() = (batch_size, num_seq, 2*hidden_size)
        # h_n.size() = (2, batch_size, hidden_size)
        # c_n.size() = (2, batch_size, hidden_size)
        attn_weight_matrix = self.attention_net(output)
        # attn_weight_matrix.size() = (batch_size, r, num_seq)
        # output.size() = (batch_size, num_seq, 2*hidden_size)
        hidden_matrix = torch.bmm(attn_weight_matrix, output)
        # hidden_matrix.size() = (batch_size, r, 2*hidden_size)
        # Concatenate the rows of hidden_matrix and feed the result to the fully connected layer.
        fc_out = self.fc_layer(hidden_matrix.view(-1, hidden_matrix.size()[1]*hidden_matrix.size()[2]))
        logits = self.label(fc_out)
        # logits.size() = (batch_size, output_size)

        return logits

Could you help, please?

Thanks
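
For completeness, a minimal sketch of how the class above can be instantiated and called (assuming the HuggingFace transformers library; the multilingual checkpoint name is only an example, an Arabic-specific checkpoint would normally be used):

    from transformers import BertModel, BertTokenizer

    bert = BertModel.from_pretrained('bert-base-multilingual-cased')
    tokenizer = BertTokenizer.from_pretrained('bert-base-multilingual-cased')

    model = SelfAttention(batch_size=32, output_size=2, hidden_size=256, bert=bert).cuda()

    # One sentence, so pass batch_size=1 explicitly (see forward()).
    encoded = tokenizer(["an example sentence"], return_tensors='pt', padding=True)
    logits = model(encoded['input_ids'].cuda(), batch_size=1)
    print(logits.shape)   # torch.Size([1, 2])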
