
vdcnn's Introduction

VDCNN

TensorFlow implementation of Very Deep Convolutional Networks for Text Classification, proposed by Conneau et al.

The VDCNN architecture is now correctly re-implemented with TensorFlow 2 and tf.keras support. A simple training interface is implemented following the TensorFlow 2 expert tutorial. Feel free to contribute additional utilities like TensorBoard support.

Side note, if you are a newcomer to NLP text classification:

  • Please check out newer SOTA NLP methods like Transformers or BERT.

  • Check out PyTorch for much better dynamic graphing and dataset object support.

    • The current VDCNN implementation is also very easy to port to PyTorch.

Prerequisites

  • Python 3
  • TensorFlow >= 2.0
  • tensorflow-datasets
  • numpy

Datasets

The original paper tests several NLP datasets, including DBPedia, AG's News, Sogou News, and others.

tensorflow-datasets is used to support the AG's News dataset.
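For example, AG's News can be loaded through tensorflow-datasets; a minimal sketch, assuming a recent tfds version where the dataset is registered under the name "ag_news_subset":

```python
import tensorflow_datasets as tfds

# AG's News as registered in tfds: 120,000 train / 7,600 test examples
(train_ds, test_ds), info = tfds.load(
    "ag_news_subset", split=["train", "test"],
    as_supervised=True, with_info=True)
```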

Downloads of those NLP text classification datasets can be found here (Many thanks to ArdalanM):

| Dataset                | Classes | Train samples | Test samples | Source |
|------------------------|---------|---------------|--------------|--------|
| AG's News              | 4       | 120,000       | 7,600        | link   |
| Sogou News             | 5       | 450,000       | 60,000       | link   |
| DBPedia                | 14      | 560,000       | 70,000       | link   |
| Yelp Review Polarity   | 2       | 560,000       | 38,000       | link   |
| Yelp Review Full       | 5       | 650,000       | 50,000       | link   |
| Yahoo! Answers         | 10      | 1,400,000     | 60,000       | link   |
| Amazon Review Full     | 5       | 3,000,000     | 650,000      | link   |
| Amazon Review Polarity | 2       | 3,600,000     | 400,000      | link   |

Parameters Setting

The original paper suggests the following details for training:

  • SGD optimizer with learning rate 1e-2 and momentum 0.9.
  • 10-15 epochs for convergence.
  • He initialization.

Some additional parameter settings for this repo:

  • Gradient clipping with a norm value of 7.0, to stabilize the training.
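A minimal tf.keras sketch of these settings (variable names are illustrative, not the repo's actual API):

```python
import tensorflow as tf

# SGD as the paper suggests, plus this repo's gradient clipping
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-2, momentum=0.9,
                                    clipnorm=7.0)

# He initialization for the convolutional layers
conv = tf.keras.layers.Conv1D(64, kernel_size=3, padding="same",
                              kernel_initializer="he_normal")
```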

Skip connections and pooling are now correctly implemented:

  • k-max pooling.
  • max pooling with kernel size 3 and stride 2.
  • convolutional pooling with a K_i convolutional layer.

For dotted skip connections:

  • Identity with zero padding.
  • Conv1D with kernel size of 1 (see the sketch below).
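To make the second option concrete, here is a minimal tf.keras sketch of a projection shortcut; the function name and the stride-2 assumption (matching the temporal dimension halved by pooling) are mine, not the repo's actual code:

```python
import tensorflow as tf

def projection_shortcut(x, residual, filters):
    """Dotted shortcut: a kernel-size-1 Conv1D projects x to the residual
    branch's channel count; stride 2 matches the pooled temporal length."""
    shortcut = tf.keras.layers.Conv1D(filters, kernel_size=1, strides=2,
                                      padding="same")(x)
    return tf.keras.layers.Add()([shortcut, residual])
```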

Please refer to Conneau et al. for their methodology and experiments in more detail.

Experiments

Results are reported as follows: (i) / (ii)

  • (i): Test set accuracy reported by the paper (acc = 100% - error_rate)
  • (ii): Test set accuracy reproduced by this Keras implementation

TODO: Feel free to report your own experimental results in the following format:

Results for "Identity" Shortcut, "k-max" Pooling:

| Depth     | ag_news         | DBPedia         | Sogou News      |
|-----------|-----------------|-----------------|-----------------|
| 9 layers  | 90.17 / xx.xxxx | 98.44 / xx.xxxx | 96.42 / xx.xxxx |
| 17 layers | 90.61 / xx.xxxx | 98.39 / xx.xxxx | 96.49 / xx.xxxx |
| 29 layers | 91.33 / xx.xxxx | 98.59 / xx.xxxx | 96.82 / xx.xxxx |
| 49 layers | xx.xx / xx.xxxx | xx.xx / xx.xxxx | xx.xx / xx.xxxx |

Reference

Original preprocessing code and VDCNN implementation by geduo15

Train script and data iterator from Convolutional Neural Network for Text Classification

NLP datasets gathered by ArdalanM and others


vdcnn's Issues

Training fails

Hi,

I ran your code on ag_news, but I encounter a training failure every time (test accuracy is 25%, i.e. chance level on the 4-class task).

It looks like the way the model is restored has some issues. Any ideas about this?

Unknown characters

In the original paper they allocate an encoding character for all characters outside the range they actually encode. It isn't obvious to me that you have done this in your code. Any reason? Or am I just not seeing where that is being done?
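For reference, the scheme the paper describes maps every out-of-alphabet character to one shared "unknown" index; a minimal sketch, with an alphabet approximating the paper's and index choices that are mine:

```python
# alphabet approximating the paper's; index 0 is reserved for padding
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{}"
char_to_idx = {c: i + 1 for i, c in enumerate(ALPHABET)}
UNK_IDX = len(ALPHABET) + 1  # one shared id for every out-of-alphabet character

def encode(text, max_len=1024):
    ids = [char_to_idx.get(c, UNK_IDX) for c in text.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))  # pad with the reserved 0
```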

Regarding Data Set

Hello,

For the DBPedia dataset, and other datasets like Amazon Reviews, there are 3 columns. The first one is the class/target. Between the second and third columns, which one should we select? I have trained a different model on the third-column data and ignored the second column (what is the use of that?). I got 99.02% test accuracy, which beats the authors' results. Did I take the right columns?

Thanks

bug in load_csv_file()

The code at line 33 is:

```python
if i > sequence_max_length - 1:
```

I think it should be:

```python
if i >= sequence_max_length - 1:
```

There is a `for` loop in `Convolutional_Block` and only the last conv output is used, why?

The code is:

```python
for i in range(2):
    with tf.variable_scope("conv1d_%s" % str(i)):
        filter_shape = [3, inputs.get_shape()[2], num_filters]
        W = tf.get_variable(name='W', shape=filter_shape,
                            initializer=he_normal,
                            regularizer=regularizer)
        out = tf.nn.conv1d(inputs, W, stride=1, padding="SAME")
        out = tf.layers.batch_normalization(inputs=out, momentum=0.997, epsilon=1e-5,
                                            center=True, scale=True, training=is_training)
        out = tf.nn.relu(out)
        print("Conv1D:", out.get_shape())
```
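Presumably the second iteration is meant to consume the first one's output; a minimal sketch of that fix, keeping the snippet's own names (whether this matches the author's intent is an assumption):

```python
out = inputs
for i in range(2):
    with tf.variable_scope("conv1d_%s" % str(i)):
        filter_shape = [3, out.get_shape()[2], num_filters]
        W = tf.get_variable(name='W', shape=filter_shape,
                            initializer=he_normal, regularizer=regularizer)
        # feed the previous iteration's output in, instead of `inputs` twice
        out = tf.nn.conv1d(out, W, stride=1, padding="SAME")
        out = tf.layers.batch_normalization(out, momentum=0.997, epsilon=1e-5,
                                            center=True, scale=True,
                                            training=is_training)
        out = tf.nn.relu(out)
```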

Issue about running your code

I encountered the following error when running your code (tf == 1.1.0):

```
Traceback (most recent call last):
File "C:/Users/syrup/Documents/VDCNN-master/train.py", line 47, in
use_k_max_pooling=FLAGS.use_k_max_pooling)
File "C:\Users\syrup\Documents\VDCNN-master\vdcnn_9.py", line 54, in init
initializer=conv_initializer)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1049, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 948, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 356, in get_variable
validate_shape=validate_shape, use_resource=use_resource)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 341, in _true_getter
use_resource=use_resource)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 714, in _get_single_variable
validate_shape=validate_shape)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 197, in init
expected_shape=expected_shape)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 275, in _init_from_args
initial_value(), name="initial_value", dtype=dtype)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 690, in
shape.as_list(), dtype=dtype, partition_info=partition_info)
TypeError: call() got an unexpected keyword argument 'partition_info'
```

load_csv_file() only loads description field

```python
text = row['fields'][-1].lower()
```

This code in the function load_csv_file() doesn't load the title part of the text. However, in the paper, both the title and the description are used for training.
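A minimal sketch of the likely fix, assuming row['fields'] holds [class, title, description] as in the CSV layout discussed above:

```python
# concatenate title and description, matching the paper's use of both fields
text = " ".join(row['fields'][1:]).lower()
```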

Questions about the test error

  1. I've run your code on the AG News dataset, and I get high accuracy during training but a relatively lower and unstable accuracy at test time. If I set is_training=True in the test step, I get a good result. Is there a problem with the batch norm? (See the sketch after this list.)

  2. What is the use of fixed_padding after the pooling layers? I didn't see such an operation in the original paper.
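For what it's worth, a classic cause of this exact symptom in TF1-style code is that batch norm's moving statistics are never updated; the standard fix attaches the UPDATE_OPS collection to the train op (the optimizer/loss names here are placeholders):

```python
# without this dependency, moving_mean/moving_variance stay at their
# initial values and inference-mode batch norm produces garbage
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss, global_step=global_step)
```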

Minor Bug

The AG News test set doesn't reach the 1024-character threshold, but if you run the code on Yahoo! Answers, which goes beyond 1024, you'll get an error, because you substitute the character before checking whether 1024 has been reached. The check for reaching the max sequence length should come before everything else! (See the sketch below.)
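A sketch of the reordering the report asks for (names are hypothetical, not the repo's):

```python
for i, char in enumerate(text):
    if i >= sequence_max_length:              # length check first...
        break
    data[i] = char_to_idx.get(char, UNK_IDX)  # ...then substitute
```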

decay step problem

Hi,
In train.py, there is this line of code:

```python
lr_decay_fn = lambda lr, global_step: tf.train.exponential_decay(lr, global_step, 100, 0.95, staircase=True)
```

May I ask why the decay step is set to 100? I have seen other code set it to around 10,000. How do you decide this parameter?
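For context, with staircase=True the documented behavior of tf.train.exponential_decay reduces to:

```python
decayed_lr = lr * 0.95 ** (global_step // 100)
```

so with a decay step of 100 the learning rate roughly halves every ~1,400 steps (0.95 ** 14 ≈ 0.49); a larger decay step just stretches the same schedule out.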

Cannot save model

Hi, your code is a really interesting implementation.
However, I ran into problems saving the model when running model.to_json() in train.py.
The error message looks like:

File "train.py", line 71, in train
model_json = model.to_json()
File "/home/mzli/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 2264, in to_json
model_config = self._updated_config()
File "/home/mzli/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 2221, in _updated_config
config = self.get_config()
File "/home/mzli/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/functional.py", line 598, in get_config
return copy.deepcopy(get_network_config(self))
File "/home/mzli/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/functional.py", line 1278, in get_network_config
layer_config = serialize_layer_fn(layer)
File "/home/mzli/.local/lib/python3.8/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 250, in serialize_keras_object
raise e
File "/home/mzli/.local/lib/python3.8/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 245, in serialize_keras_object
config = instance.get_config()
File "/home/mzli/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 676, in get_config
raise NotImplementedError('Layer %s has arguments in __init__ and '
NotImplementedError: Layer KMaxPooling has arguments in __init__ and therefore must override get_config.

May I get some advice on this issue? Many thanks!!
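The error message itself names the standard fix: a custom Keras layer whose __init__ takes arguments must override get_config so the model can be serialized. A minimal sketch, assuming KMaxPooling takes a k argument:

```python
import tensorflow as tf

class KMaxPooling(tf.keras.layers.Layer):
    def __init__(self, k=8, **kwargs):
        super().__init__(**kwargs)
        self.k = k

    def call(self, inputs):
        # keep the k largest activations along the temporal axis
        x = tf.transpose(inputs, [0, 2, 1])
        top_k = tf.math.top_k(x, k=self.k, sorted=True).values
        return tf.transpose(top_k, [0, 2, 1])

    def get_config(self):
        # report constructor args so to_json()/save() can rebuild the layer
        config = super().get_config()
        config.update({"k": self.k})
        return config
```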

accuracy question when running vdcnn29

Hello,
While running VDCNN-9 on the AG dataset without k-max pooling, I got the same result as the paper (89.83%).
But with VDCNN-29 with k-max pooling, I only got 90.4% accuracy, while the paper reports 91.27%.
I'd like to know the accuracy you got when you ran this network, thanks!

confused about the embedding

Great work! Sorry if this is rude, I was just confused about the embedding layer in the code.

To my knowledge, it is usually Embedding(input_dim=$vocab_size, output_dim=$embedded_size, input_length=$input_size).
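For reference, the usual tf.keras pattern the question describes (sizes are placeholders; the paper uses 16-dimensional character embeddings):

```python
embedding = tf.keras.layers.Embedding(
    input_dim=vocab_size,             # number of distinct characters
    output_dim=16,                    # embedding size, 16 in the paper
    input_length=sequence_max_length)
```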

Issue Memory Error

I am getting a Memory Error when trying to run on the Yahoo! Answers dataset! I'll try to merge in the Dataset API in order to cache batches instead of loading the whole dataset into memory. If I get through this, maybe I'll fork your repo and contribute! (The bug isn't specified exactly, because all I get is a Memory Error and nothing more indicative.)
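If it helps, the tf.data route hinted at here can stream the CSV from disk instead of materializing everything; a sketch, with the three-column layout and file name assumed:

```python
import tensorflow as tf

# stream records straight from disk; only prefetched batches live in memory
ds = tf.data.experimental.CsvDataset(
    "train.csv", record_defaults=[tf.int32, tf.string, tf.string])
ds = ds.shuffle(10_000).batch(128).prefetch(tf.data.AUTOTUNE)
```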

Please state which license (Apache?)

Thanks for doing this work - it saves me from following the same paper and reproducing it myself. At least it will if you are willing to give this an Apache license, as I need to implement a commercial version of this and cannot without a license that allows me to.

Are you willing to issue under the Apache license and add that to the source code and MD?
