python model.py --training=True --name=my_model
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
Warning -- You have opted to use the orthogonal_initializer function
you have initialized one orthogonal matrix.
you have initialized one orthogonal matrix.
you have initialized one orthogonal matrix.
you have initialized one orthogonal matrix.
Warning -- You have opted to use the orthogonal_initializer function
you have initialized one orthogonal matrix.
you have initialized one orthogonal matrix.
you have initialized one orthogonal matrix.
you have initialized one orthogonal matrix.
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients.py:90: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:02:00.0
Total memory: 7.92GiB
Free memory: 3.61GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:02:00.0)
WARNING:tensorflow:1:1 : Message type "tensorflow.CheckpointState" has no field named "version".
WARNING:tensorflow:models/my_model/checkpoint: Checkpoint ignored
W tensorflow/core/framework/op_kernel.cc:968] Out of range: RandomShuffleQueue '_4_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 32, current size 0)
[[Node: shuffle_batch = QueueDequeueMany[_class=["loc:@shuffle_batch/random_shuffle_queue"], component_types=[DT_STRING, DT_STRING, DT_INT64], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
W tensorflow/core/framework/op_kernel.cc:968] Out of range: RandomShuffleQueue '_4_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 32, current size 0)
[[Node: shuffle_batch = QueueDequeueMany[_class=["loc:@shuffle_batch/random_shuffle_queue"], component_types=[DT_STRING, DT_STRING, DT_INT64], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
W tensorflow/core/framework/op_kernel.cc:968] Out of range: RandomShuffleQueue '_4_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 32, current size 0)
[[Node: shuffle_batch = QueueDequeueMany[_class=["loc:@shuffle_batch/random_shuffle_queue"], component_types=[DT_STRING, DT_STRING, DT_INT64], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
Done!
Traceback (most recent call last):
File "model.py", line 198, in
main()
File "model.py", line 188, in main
coord.join(threads)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 386, in join
six.reraise(*self._exc_info_to_raise)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/queue_runner.py", line 225, in _run
sess.run(enqueue_op)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 717, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 915, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 965, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 985, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.DataLossError: corrupted record at 0
[[Node: ReaderRead = ReaderRead[_class=["loc:@TFRecordReader", "loc:@input_producer"], _device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReader, RefSelect)]]
[[Node: ParseSingleExample/ParseExample/ParseExample/_53 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_36_ParseSingleExample/ParseExample/ParseExample", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"]]
Caused by op u'ReaderRead', defined at:
File "model.py", line 198, in
main()
File "model.py", line 137, in main
document_batch, document_weights, query_batch, query_weights, answer_batch = read_records(dataset)
File "model.py", line 33, in read_records
_, serialized_example = reader.read(queue)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/io_ops.py", line 267, in read
return gen_io_ops._reader_read(self._reader_ref, queue_ref, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 176, in _reader_read
queue_handle=queue_handle, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1298, in init
self._traceback = _extract_stack()
DataLossError (see above for traceback): corrupted record at 0
[[Node: ReaderRead = ReaderRead[_class=["loc:@TFRecordReader", "loc:@input_producer"], _device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReader, RefSelect)]]
[[Node: ParseSingleExample/ParseExample/ParseExample/_53 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_36_ParseSingleExample/ParseExample/ParseExample", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"]]
rzai@rzai00:~/prj/attention-over-attention$