I'm trying to reproduce the test results on the Douban corpus dataset, but restoring the released checkpoint fails with a shape-mismatch error. Full log below:
(DAM) zhibo@cvda-ultra:~/zhibo/DAM$ python main.py
loading word emb init
starting loading data
2018-09-27 16:39:54
finish loading data
finish building test batches
2018-09-27 16:41:20
configurations: {'vocab_size': 434512, 'num_scan_data': 2, 'data_path': './data/douban/data.pkl', 'max_turn_num': 9, 'emb_size': 200, 'is_mask': True, 'drop_attention': None, 'word_emb_init': './data/douban/word_embedding.pkl', 'save_path': './output/douban/temp/', 'is_positional': False, 'is_layer_norm': True, '_EOS_': 1, 'learning_rate': 0.001, 'drop_dense': None, 'rand_seed': None, 'final_n_class': 1, 'batch_size': 200, 'attention_type': 'dot', 'max_turn_len': 50, 'max_to_keep': 1, 'init_model': './output/douban/DAM/DAM.ckpt', 'stack_num': 5}
WARNING:tensorflow:From /home1/zhibo/codebase/DAM/utils/operations.py:157: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
sim shape: (200, 9, 50, 50, 12)
conv_0 shape: (200, 9, 50, 50, 32)
pooling_0 shape: (200, 3, 17, 17, 32)
conv_1 shape: (200, 3, 17, 17, 16)
pooling_1 shape: (200, 1, 6, 6, 16)
build graph sucess
2018-09-27 16:44:29
2018-09-27 16:44:29.380616: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-09-27 16:44:29.608946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:05:00.0
totalMemory: 10.91GiB freeMemory: 1.72GiB
2018-09-27 16:44:29.797182: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 1 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:06:00.0
totalMemory: 10.92GiB freeMemory: 9.18GiB
2018-09-27 16:44:30.025387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 2 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:09:00.0
totalMemory: 10.92GiB freeMemory: 10.55GiB
2018-09-27 16:44:30.263003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 3 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:0a:00.0
totalMemory: 10.92GiB freeMemory: 8.12GiB
2018-09-27 16:44:30.263730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Device peer to peer matrix
2018-09-27 16:44:30.263833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] DMA: 0 1 2 3
2018-09-27 16:44:30.263845: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 0: Y Y Y Y
2018-09-27 16:44:30.263853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 1: Y Y Y Y
2018-09-27 16:44:30.263863: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 2: Y Y Y Y
2018-09-27 16:44:30.263870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 3: Y Y Y Y
2018-09-27 16:44:30.263882: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:05:00.0, compute capability: 6.1)
2018-09-27 16:44:30.263891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:06:00.0, compute capability: 6.1)
2018-09-27 16:44:30.263899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: GeForce GTX 1080 Ti, pci bus id: 0000:09:00.0, compute capability: 6.1)
2018-09-27 16:44:30.263907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:3) -> (device: 3, name: GeForce GTX 1080 Ti, pci bus id: 0000:0a:00.0, compute capability: 6.1)
Traceback (most recent call last):
File "main.py", line 62, in <module>
test.test(conf, model)
File "/home1/zhibo/codebase/DAM/bin/test_and_evaluate.py", line 41, in test
_model.saver.restore(sess, conf["init_model"])
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1686, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [434513,200] rhs shape= [172131,200]
[[Node: loss/save/Assign_293 = Assign[T=DT_FLOAT, _class=["loc:@word_embedding"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](loss/word_embedding/Adam, loss/save/RestoreV2_293/_571)]]
Caused by op u'loss/save/Assign_293', defined at:
File "main.py", line 62, in <module>
test.test(conf, model)
File "/home1/zhibo/codebase/DAM/bin/test_and_evaluate.py", line 35, in test
_graph = _model.build_graph()
File "/home1/zhibo/codebase/DAM/models/net.py", line 188, in build_graph
self.saver = tf.train.Saver(max_to_keep = self._conf["max_to_keep"])
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1239, in __init__
self.build()
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1248, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1284, in _build
build_save=build_save, build_restore=build_restore)
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 765, in _build_internal
restore_sequentially, reshape)
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 440, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 160, in restore
self.op.get_shape().is_fully_defined())
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
validate_shape=validate_shape)
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 59, in assign
use_locking=use_locking, name=name)
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [434513,200] rhs shape= [172131,200]
[[Node: loss/save/Assign_293 = Assign[T=DT_FLOAT, _class=["loc:@word_embedding"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](loss/word_embedding/Adam, loss/save/RestoreV2_293/_571)]]
The conf I used in main.py:

conf = {
# "data_path": "./data/ubuntu/data.pkl",
"data_path": "./data/douban/data.pkl",
# "save_path": "./output/ubuntu/temp/",
"save_path": "./output/douban/temp/",
"word_emb_init": "./data/douban/word_embedding.pkl",
"init_model": "./output/douban/DAM/DAM.ckpt", #should be set for test
"rand_seed": None,
"drop_dense": None,
"drop_attention": None,
"is_mask": True,
"is_layer_norm": True,
"is_positional": False,
"stack_num": 5,
"attention_type": "dot",
"learning_rate": 1e-3,
"vocab_size": 434512,
"emb_size": 200,
# "batch_size": 256, #200 for test
"batch_size": 200, #200 for test
"max_turn_num": 9,
"max_turn_len": 50,
"max_to_keep": 1,
"num_scan_data": 2,
# "_EOS_": 28270, #1 for douban data
"_EOS_": 1, #1 for douban data
"final_n_class": 1,
}
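For what it's worth, the numbers in the error line up with the conf above: the graph's word_embedding variable is built with vocab_size + 1 = 434513 rows (the lhs in the error), while the checkpoint stores only 172131 rows (the rhs), so the released DAM.ckpt seems to have been trained with a different, smaller vocabulary than the data.pkl/word_embedding.pkl I'm loading. A minimal sketch of that arithmetic (helper names are mine, not from the repo):

```python
# Sketch of the check that Saver.restore effectively enforces:
# the restored tensor's shape must equal the graph variable's shape.
# DAM allocates vocab_size + 1 embedding rows (one extra row,
# presumably for padding), matching lhs = 434513 in the error.

def embedding_rows(vocab_size):
    """Rows the graph allocates for word_embedding: vocab_size + 1."""
    return vocab_size + 1

def shapes_match(conf_vocab_size, ckpt_rows):
    """True iff the checkpoint's embedding rows fit the graph variable."""
    return embedding_rows(conf_vocab_size) == ckpt_rows

graph_rows = embedding_rows(434512)  # vocab_size from conf -> 434513 (lhs)
ckpt_rows = 172131                   # rows stored in DAM.ckpt (rhs)
print(graph_rows, shapes_match(434512, ckpt_rows))  # 434513 False
```

So either the checkpoint or the data/word-embedding pickles appear to come from mismatched releases; using a data.pkl whose vocabulary has 172130 entries (plus the extra row) should make restore succeed.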