r-dfpn_fpn_tensorflow's Introduction

Recommended improved code: https://github.com/DetectionTeamUCAS

A Tensorflow implementation of the R-DFPN detection framework based on FPN.
For other rotation detection methods, see R2CNN, RRPN and R2CNN_HEAD.
If this is useful to you, please star the repo to support my work. Thanks.

Citation

Some related publications based on this code:

@article{[yang2018position](https://ieeexplore.ieee.org/document/8464244),
	title={Position Detection and Direction Prediction for Arbitrary-Oriented Ships via Multitask Rotation Region Convolutional Neural Network},
	author={Yang, Xue and Sun, Hao and Sun, Xian and  Yan, Menglong and Guo, Zhi and Fu, Kun},
	journal={IEEE Access},
	volume={6},
	pages={50839-50849},
	year={2018},
	publisher={IEEE}
}

@article{[yang2018r-dfpn](http://www.mdpi.com/2072-4292/10/1/132),
	title={Automatic ship detection in remote sensing images from Google Earth of complex scenes based on multiscale rotation dense feature pyramid networks},
	author={Yang, Xue and Sun, Hao and Fu, Kun and Yang, Jirui and Sun, Xian and Yan, Menglong and Guo, Zhi},
	journal={Remote Sensing},
	volume={10},
	number={1},
	pages={132},
	year={2018},
	publisher={Multidisciplinary Digital Publishing Institute}
} 

Configuration Environment

Ubuntu (encoding problems may occur on Windows) + python2 + tensorflow 1.2 + cv2 + cuda 8.0 + GeForce GTX 1080
If you want to use the CPU, set use_gpu = False for the NMS and IOU functions in cfgs.py (a sketch follows below).
You can also use a docker environment: docker pull yangxue2docker/tensorflow3_gpu_cv2_sshd:v1.0
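
A minimal sketch of what the CPU switch in libs/configs/cfgs.py amounts to; the constant names below are hypothetical, and only the idea of setting the NMS/IOU use_gpu flags to False comes from the note above.

# Hypothetical excerpt of libs/configs/cfgs.py for CPU-only use.
ROOT_PATH = '/path/to/R-DFPN_FPN_Tensorflow'  # project root, adjust for your machine

# Rotated NMS / IOU implementation: True -> CUDA kernels, False -> CPU fallback.
NMS_USE_GPU = False   # hypothetical constant name; look for the use_gpu flags in cfgs.py
IOU_USE_GPU = False   # hypothetical constant name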

Installation

Clone the repository

git clone https://github.com/yangxue0827/R-DFPN_FPN_Tensorflow.git    

Make tfrecord

The data is in VOC format; see the reference here.
Data path format (see $R-DFPN_ROOT/data/io/divide_data.py):

├── VOCdevkit
│   ├── VOCdevkit_train
│   │   ├── Annotation
│   │   ├── JPEGImages
│   ├── VOCdevkit_test
│   │   ├── Annotation
│   │   ├── JPEGImages
cd $R-DFPN_ROOT/data/io/    
python convert_data_to_tfrecord.py --VOC_dir='***/VOCdevkit/VOCdevkit_train/' --save_name='train' --img_format='.jpg' --dataset='ship'   
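
For orientation, a minimal sketch of what a VOC-to-tfrecord conversion such as convert_data_to_tfrecord.py boils down to; the directory layout matches the tree above, but the feature keys and the XML handling are illustrative rather than the repo's exact code.

# Sketch: serialize each image plus its VOC-style annotation into one tfrecord.
import os
import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

voc_dir = 'VOCdevkit/VOCdevkit_train'
writer = tf.python_io.TFRecordWriter('train.tfrecord')
for xml_name in sorted(os.listdir(os.path.join(voc_dir, 'Annotation'))):
    img_name = xml_name.replace('.xml', '.jpg')
    with open(os.path.join(voc_dir, 'JPEGImages', img_name), 'rb') as f:
        img_bytes = f.read()
    # A real converter would also parse the XML here and store the rotated
    # boxes (the eight corner coordinates plus the class label) per object.
    example = tf.train.Example(features=tf.train.Features(feature={
        'img_name': _bytes_feature(img_name.encode('utf-8')),
        'img': _bytes_feature(img_bytes),
    }))
    writer.write(example.SerializeToString())
writer.close()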

Compile

cd $R-DFPN_ROOT/libs/box_utils/
python setup.py build_ext --inplace

Demo

1、Unzip the weights in $R-DFPN_ROOT/output/res101_trained_weights/*.rar
2、Put images in $R-DFPN_ROOT/tools/inference_image
3、Configure parameters in $R-DFPN_ROOT/libs/configs/cfgs.py and modify the project's root directory
4、Image slice:

cd $R-DFPN_ROOT/tools
python inference.py    

5、Big image (a minimal slicing sketch follows after the command):

cd $R-DFPN_ROOT/tools
python demo.py --src_folder=./demo_src --des_folder=./demo_des
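
demo.py handles big images by cutting them into overlapping patches before running detection; a minimal sketch of that slicing idea follows (patch size, stride, and the example path are illustrative, not the script's actual parameters).

# Sketch: slice a large image into overlapping patches for patch-wise detection.
import cv2

def slice_image(path, patch_h=600, patch_w=1000, stride_h=450, stride_w=750):
    img = cv2.imread(path)
    h, w = img.shape[:2]
    patches = []
    for y in range(0, max(h - patch_h, 0) + 1, stride_h):
        for x in range(0, max(w - patch_w, 0) + 1, stride_w):
            # (y, x) offsets are kept so detections can be mapped back to the big image.
            patches.append(((y, x), img[y:y + patch_h, x:x + patch_w]))
    # A full implementation would also take extra patches flush with the
    # right and bottom edges so no region is missed.
    return patches

patches = slice_image('demo_src/example.jpg')  # illustrative path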

Train

1、Modify $R-DFPN_ROOT/libs/lable_name_dict/***_dict.py so that it matches the number of categories in the configuration file (a sketch follows after the commands below)
2、Download the pretrained weights (resnet_v1_101_2016_08_28.tar.gz or resnet_v1_50_2016_08_28.tar.gz) from here, then extract them to the folder $R-DFPN_ROOT/data/pretrained_weights
3、Start training:

cd $R-DFPN_ROOT/tools    
python train.py    
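
A minimal sketch of what such a label dictionary typically looks like; the variable names are assumptions, and the key point is simply that the mapping must contain a background entry plus one entry per category configured in cfgs.py.

# Hypothetical content of a libs/lable_name_dict/*_dict.py file.
NAME_LABEL_MAP = {
    'back_ground': 0,
    'ship': 1,
    # add one entry per extra category when training on your own dataset
}

LABEL_NAME_MAP = {v: k for k, v in NAME_LABEL_MAP.items()}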

Test tfrecord

cd $R-DFPN_ROOT/tools     
python test.py     

Eval (not recommended; please refer here)

cd $R-DFPN_ROOT/tools       
python ship_eval.py    

Summary

tensorboard --logdir=$R-DFPN_ROOT/output/res101_summary/     

(TensorBoard scalar summary screenshots)

Graph

(TensorBoard graph screenshot)

Test results

(example detection result images)


r-dfpn_fpn_tensorflow's Issues

train on DOTA dataset

Training on DOTA with num_class=15, there is an error:
InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [16] rhs shape= [2]
Maybe the pretrained model was trained on your own dataset instead of ImageNet?
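
If that is the cause, one common workaround is to restore only the checkpoint variables whose shapes still match and let the class-dependent heads train from scratch. A minimal sketch, not code from this repo:

# Sketch: build a Saver that skips variables whose shapes differ from the checkpoint.
import tensorflow as tf

def build_restore_saver(checkpoint_path):
    reader = tf.train.NewCheckpointReader(checkpoint_path)
    ckpt_shapes = reader.get_variable_to_shape_map()
    restore_vars = []
    for var in tf.global_variables():
        name = var.op.name
        if name in ckpt_shapes and var.get_shape().as_list() == ckpt_shapes[name]:
            restore_vars.append(var)  # shape matches, safe to restore
        # otherwise (e.g. the class-dependent heads sized for num_class) the
        # variable keeps its fresh initialization and is trained from scratch
    return tf.train.Saver(restore_vars)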

I get some trouble in compile

Hi, I ran into some trouble when trying to use setup.py to compile 'rbbox_overlaps_kernels.cu'.
It raised:
"Traceback (most recent call last):
File "setup.py", line 59, in
CUDA = locate_cuda()
File "setup.py", line 47, in locate_cuda
raise EnvironmentError('The nvcc binary could not be '
OSError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOME"
What should I do?

When I was training on my data, training suddenly stopped and the error is as follows:

Traceback (most recent call last):
File "train.py", line 298, in
train()
File "train.py", line 275, in train
summary_str = sess.run(summary_op)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: input must have at least k columns
[[Node: rpn_losses/TopKV2 = TopKV2[T=DT_FLOAT, sorted=true, _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_losses/strided_slice_2/_4331, rpn_losses/TopKV2/k)]]
[[Node: fast_rcnn_predict/fast_rcnn_proposals/Where/_4155 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_6861_fast_rcnn_predict/fast_rcnn_proposals/Where", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"]]

Caused by op u'rpn_losses/TopKV2', defined at:
File "train.py", line 298, in
train()
File "train.py", line 93, in train
rpn_location_loss, rpn_classification_loss, rpn_predict_boxes, rpn_predict_scores = rpn.rpn_losses()
File "../libs/rpn/build_rpn.py", line 456, in rpn_losses
top_k_scores, top_k_indices = tf.nn.top_k(minibatch_boxes_softmax_scores[:, 1], k=20)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1946, in top_k
return gen_nn_ops._top_kv2(input, k=k, sorted=sorted, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 2572, in _top_kv2
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1269, in init
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): input must have at least k columns
[[Node: rpn_losses/TopKV2 = TopKV2[T=DT_FLOAT, sorted=true, _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_losses/strided_slice_2/_4331, rpn_losses/TopKV2/k)]]
[[Node: fast_rcnn_predict/fast_rcnn_proposals/Where/_4155 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_6861_fast_rcnn_predict/fast_rcnn_proposals/Where", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"]]
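
The failing op is the tf.nn.top_k call with a fixed k=20 in build_rpn.py (line 456 of the traceback); top_k raises exactly this error when the minibatch yields fewer than 20 scores. A minimal sketch of the usual fix, clamping k to the number of available scores (not a patch from the repo):

# Sketch: clamp k so tf.nn.top_k never asks for more elements than exist.
import tensorflow as tf

# 'scores' stands in for minibatch_boxes_softmax_scores[:, 1]; here only 5 entries.
scores = tf.constant([0.9, 0.1, 0.5, 0.3, 0.7])
k = tf.minimum(20, tf.shape(scores)[0])
top_k_scores, top_k_indices = tf.nn.top_k(scores, k=k)

with tf.Session() as sess:
    print(sess.run(top_k_scores))  # works even though fewer than 20 scores exist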

out of memory

Sorry for bothering you again. When I train with one 1080 GPU and a batch size of 1, I get the following errors. How can I solve them?

2018-05-10 13:42:49: step247692 image_name:000624.jpg |
rpn_loc_loss:0.189756244421 | rpn_cla_loss:0.214562356472 | rpn_total_loss:0.404318600893 |
fast_rcnn_loc_loss:0.0 | fast_rcnn_cla_loss:0.00815858319402 | fast_rcnn_total_loss:0.00815858319402 |
total_loss:1.17546725273 | per_cost_time:0.65540599823s
out of memory
invalid argument
2018-05-10 13:42:53.349625: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:639] failed to record completion event; therefore, failed to create inter-stream dependency
2018-05-10 13:42:53.349637: I tensorflow/stream_executor/stream.cc:4138] stream 0x55cd063dc880 did not memcpy host-to-device; source: 0x7fa30b0da010
2018-05-10 13:42:53.349641: E tensorflow/stream_executor/stream.cc:289] Error recording event in stream: error recording CUDA event on stream 0x55cd063dc950: CUDA_ERROR_ILLEGAL_ADDRESS; not marking stream as bad, as the Event object may be at fault. Monitor for further errors.
2018-05-10 13:42:53.349647: E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS
2018-05-10 13:42:53.349650: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:203] Unexpected Event status: 1
an illegal memory access was encountered
an illegal memory access was encountered

OutOfRangeError

What could be the cause of the following error during training?
OutOfRangeError (see above for traceback): PaddingFIFOQueue '_2_get_batch/batch/padding_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: get_batch/batch = QueueDequeueManyV2[component_types=[DT_STRING, DT_FLOAT, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](get_batch/batch/padding_fifo_queue, get_batch/batch/n)]]

CUDA_ERROR_ILLEGAL_ADDRESS error when many objects in one image

Hi,

I encountered the CUDA_ERROR_ILLEGAL_ADDRESS error during training when the objects are densely located, i.e. when there are many (roughly more than 100 or 150) ships in one 600×1000 image. When I split these dense images into smaller ones (keeping the object count below 50) and pad the split images back to 600×1000, training goes on well with no error. BTW, the same error happens in all your repositories (R2CNN, RRPN, R-DFPN).

But I think splitting the images with densely located objects may hurt the final performance, so I want to get this problem solved. Could you please tell me which part causes this error? Maybe the GPU NMS, or something else?


Out of Memory

When I was training on my data, around epoch 291 (or elsewhere), errors occur seemingly at random. Watching the GPU memory, it was steady at first, then it suddenly increased and the following errors occurred. How can I solve this problem? Thank you.

Allocator (GPU_0_bfc) ran out of memory trying to allocate 4.27GiB. Current allocation summary follows.
2019-07-28 14:29:37.479031: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (256): Total Chunks: 4, Chunks in use: 0 1.0KiB allocated for chunks. 19B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-07-28 14:29:37.479056: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (512): Total Chunks: 3, Chunks in use: 0 1.5KiB allocated for chunks. 210B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-07-28 14:29:37.479072: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (1024): Total Chunks: 1, Chunks in use: 0 1.0KiB allocated for chunks. 80B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
...
2019-07-28 14:29:37.541280: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 2 Chunks of size 21607936 totalling 41.21MiB
2019-07-28 14:29:37.541286: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 9 Chunks of size 23040000 totalling 197.75MiB
2019-07-28 14:29:37.541293: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 29791744 totalling 28.41MiB
2019-07-28 14:29:37.541302: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 30857472 totalling 29.43MiB
2019-07-28 14:29:37.541312: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 44332032 totalling 42.28MiB
2019-07-28 14:29:37.541321: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 3 Chunks of size 51380224 totalling 147.00MiB
2019-07-28 14:29:37.541330: I tensorflow/core/common_runtime/bfc_allocator.cc:700] Sum Total of in-use chunks: 1.99GiB
2019-07-28 14:29:37.541345: I tensorflow/core/common_runtime/bfc_allocator.cc:702] Stats:
Limit: 10990990132
InUse: 2133677824
MaxInUse: 6298388480
NumAllocs: 1694962
MaxAllocSize: 4294967296

2019-07-28 14:29:37.541752: W tensorflow/core/common_runtime/bfc_allocator.cc:277] _________________________________________________________________*********************************
2019-07-28 14:29:38.079996: W tensorflow/core/kernels/queue_base.cc:294] _0_get_batch/input_producer: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.080480: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.080751: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.080771: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.080789: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.080805: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.080858: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.080873: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.080886: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.080937: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.080997: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.081011: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.081027: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.081041: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.081054: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.081067: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2019-07-28 14:29:38.081078: W tensorflow/core/kernels/queue_base.cc:294] _1_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
Traceback (most recent call last):
File "train.py", line 299, in
train()
File "train.py", line 260, in train
fast_rcnn_total_loss, total_loss, train_op])
File "/home/hbk/miniconda3/envs/mytensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/home/hbk/miniconda3/envs/mytensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/home/hbk/miniconda3/envs/mytensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/home/hbk/miniconda3/envs/mytensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: rpn_losses/rpn_minibatch/rpn_find_positive_negative_samples/PyFunc/_3605 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_20948_rpn_losses/rpn_minibatch/rpn_find_positive_negative_samples/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

convert_data_to_tfrecord FAIL

While converting the data to tfrecord with convert_data_to_tfrecord.py, I met this problem: yangJirui/RRPN_FPN_Tensorflow#1.

I saw a segmentation fault, but I don't know what to modify.

One of the XML files is shown below.

(The XML tags were stripped by the browser's XML viewer; roughly, the file contains:)

folder: VOC2007
filename: 20180718003NG.bmp
size: 4112 × 2176 × 3, segmented: 0
objects: four objects of class "circle" (pose Unknown, truncated 0, difficult 0), each with quadrilateral corner coordinates:
  (2813, 72), (3313, 282), (3225, 491), (2725, 281)
  (3521, 1156), (3696, 1229), (3484, 1736), (3308, 1662)
  (2140, 1742), (2597, 1897), (2530, 2096), (2072, 1940)
  (1870, 558), (2087, 619), (1955, 1089), (1738, 1028)
What is the problem with this XML file?
Could you help me, please? Thank you in advance. @yangxue0827

ImportError: dynamic module does not define module export function (PyInit_rbbox_overlaps)

When running python train.py, I encountered this error message:

Traceback (most recent call last):
  File "train.py", line 16, in <module>
    from libs.rpn import build_rpn
  File "../libs/rpn/build_rpn.py", line 10, in <module>
    from libs.box_utils import iou_rotate
  File "../libs/box_utils/iou_rotate.py", line 10, in <module>
    from libs.box_utils.rbbox_overlaps import rbbx_overlaps
ImportError: dynamic module does not define module export function (PyInit_rbbox_overlaps)

Do I need to compile this cpp file again?

Environment: CentOS 7 + tensorflow 1.8 + cuda 8.0 + python3

Question: how to define x0, y0, x1, y1, x2, y2, x3, y3 for a rotated box?

I am building my own dataset and have labeled an image with a tilted rectangle whose top-edge corners are labeled x0 and x1 and whose bottom-edge corners are labeled x2 and x3.

Then I want to do data augmentation by rotating the image, but when I redefine the coordinates, the original x0 may end up below x1, or even at x3's position, so the annotation order has to change, right?
An even trickier case: a square with one corner pointing straight up, symmetric about the y axis; which corner is x0 then?

Or do I only need to keep the coordinates in clockwise order, without caring which corner comes first?

How to make the annotation?

Hi, thanks for your work. I want to annotate my own data with 8 parameters rather than the 4 parameters of PASCAL VOC (which uses labelImg). Are there tools to do this?
In your paper you use five parameters (x, y, w, h, theta), but the sample.xml you provide has eight parameters. Can you explain this to me? Thanks a lot.
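
For reference, the eight corner coordinates in sample.xml and the five-parameter box (x, y, w, h, theta) in the paper describe the same rectangle. A minimal sketch of converting one into the other with cv2.minAreaRect; the repo's own conversion utilities may normalize the angle range differently.

# Sketch: quadrilateral (x0, y0, ..., x3, y3) -> rotated box (x_c, y_c, w, h, theta).
import cv2
import numpy as np

def eight_points_to_rbox(pts8):
    pts = np.array(pts8, dtype=np.float32).reshape(4, 2)
    (x_c, y_c), (w, h), theta = cv2.minAreaRect(pts)
    return x_c, y_c, w, h, theta

# Example with the first object from the XML quoted in an issue above.
print(eight_points_to_rbox([2813, 72, 3313, 282, 3225, 491, 2725, 281]))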

the problem about the version of cuda

Thanks for sharing. Great job!

I've got one question. Is it really necessary to use tensorflow 1.2 and cuda 8?

I mean, the versions of tensorflow and cuda are both higher now. My machine has tensorflow 1.13 and cuda 9.1, which works for other projects. When I run the demo in this project, the result shows "libcudart.so.8.0: cannot open shared object file: No such file or directory". So I guess cuda 8 is hard-wired in the program, am I right?

in rpn_conv2d_3x3, the kernel_size is 5.

In build_rpn.py, the rpn_conv2d_3x3 layer's kernel_size is [5, 5] and the stride is 1, so I think the output size will be smaller. Is there a problem here? Maybe what I think is wrong; I am not very skilled with tf.
In the FPN paper, the kernel size is 3 and the output size is the same as the input.
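
With stride 1 and 'SAME' padding (the default in tf.contrib.slim's conv2d, which this codebase appears to use), the output spatial size equals the input size regardless of kernel size; only 'VALID' padding shrinks it. A minimal sketch of the padding arithmetic:

# Sketch: output size of a convolution under TensorFlow's two padding modes.
import math

def conv_out_size(in_size, kernel, stride, padding):
    if padding == 'SAME':
        return int(math.ceil(in_size / float(stride)))
    # 'VALID': no padding at all
    return int(math.ceil((in_size - kernel + 1) / float(stride)))

print(conv_out_size(600, 5, 1, 'SAME'))   # 600 -> size preserved even with a 5x5 kernel
print(conv_out_size(600, 5, 1, 'VALID'))  # 596 -> smaller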

EnvironmentError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOME

The system environment used:
VMware Workstation 15 Pro + Ubuntu 18.0 + Python 2.7 + Tensorflow 1.2.1 + opencv-python 4.2.0
I'd like to install R-DFPN_FPN_Tensorflow in CPU-only mode. Is this possible? Which configuration files should I modify? Can I get rid of the CUDA software?
Here is a list of the files that I have modified:
$R-DFPN_ROOT/libs/configs/cfgs.py (modify the parameters of the NMS and IOU functions, use_gpu = False)
python setup.py build_ext --inplace
I got:
python setup.py build_ext --inplace
Traceback (most recent call last):
File "setup.py", line 59, in
CUDA = locate_cuda()
File "setup.py", line 47, in locate_cuda
raise EnvironmentError('The nvcc binary could not be '
EnvironmentError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOME

InvalidArgumentError (see above for traceback): input must have at least k columns

When I use SAR images, I get the following error. Can you help me solve it?
InvalidArgumentError (see above for traceback): input must have at least k columns
[[Node: rpn_losses/TopKV2 = TopKV2[T=DT_FLOAT, sorted=true, _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_losses/strided_slice_2/_4989, rpn_losses/TopKV2/k)]]
[[Node: fast_rcnn_predict/fast_rcnn_proposals/cond/Slice/_4787 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_7827_fast_rcnn_predict/fast_rcnn_proposals/cond/Slice", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"]]

Process finished with exit code 1

fail to train

2018-09-09 20:40:36: step1120 image_name:20180727008OK.bmp |
rpn_loc_loss:0.0507980324328 | rpn_cla_loss:0.00724989641458 | rpn_total_loss:0.058047927916 |
fast_rcnn_loc_loss:0.0158959124237 | fast_rcnn_cla_loss:0.00159495836124 | fast_rcnn_total_loss:0.0174908712506 |
total_loss:0.835565567017 | per_cost_time:1.12856292725s
Traceback (most recent call last):
File "train.py", line 298, in
train()
File "train.py", line 259, in train
fast_rcnn_total_loss, total_loss, train_op])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: PaddingFIFOQueue '_1_get_batch/batch/padding_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: get_batch/batch = QueueDequeueManyV2[component_types=[DT_STRING, DT_FLOAT, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](get_batch/batch/padding_fifo_queue, get_batch/batch/n)]]

Caused by op u'get_batch/batch', defined at:
File "train.py", line 298, in
train()
File "train.py", line 36, in train
is_training=True)
File "/home/max/R-DFPN_FPN_Tensorflow/data/io/read_tfrecord.py", line 85, in next_batch
dynamic_pad=True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 988, in batch
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 762, in _batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
component_types=component_types, timeout_ms=timeout_ms, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1718, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

OutOfRangeError (see above for traceback): PaddingFIFOQueue '_1_get_batch/batch/padding_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: get_batch/batch = QueueDequeueManyV2[component_types=[DT_STRING, DT_FLOAT, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](get_batch/batch/padding_fifo_queue, get_batch/batch/n)]]

I converted the tfrecord successfully, without a segmentation fault, but the problem above still appears.

What should I modify? Thank you so much for your help in advance. @yangxue0827

OutOfRangeError PaddingFIFOQueue

Traceback (most recent call last):
File "train.py", line 298, in
train()
File "train.py", line 259, in train
fast_rcnn_total_loss, total_loss, train_op])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: PaddingFIFOQueue '_1_get_batch/batch/padding_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: get_batch/batch = QueueDequeueManyV2[component_types=[DT_STRING, DT_FLOAT, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](get_batch/batch/padding_fifo_queue, get_batch/batch/n)]]

Caused by op u'get_batch/batch', defined at:
File "train.py", line 298, in
train()
File "train.py", line 36, in train
is_training=True)
File "../data/io/read_tfrecord.py", line 85, in next_batch
dynamic_pad=True)
Hello, my question is as follows:

OutOfRangeError (see above for traceback): PaddingFIFOQueue '_1_get_batch/batch/padding_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: get_batch/batch = QueueDequeueManyV2[component_types=[DT_STRING, DT_FLOAT, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](get_batch/batch/padding_fifo_queue, get_batch/batch/n)]]

Did anyone meet the same problem?

OutOfRangeError

I ran into the following error when testing with test.py:
OutOfRangeError (see above for traceback): PaddingFIFOQueue '_2_batch/padding_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_STRING, DT_FLOAT, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/padding_fifo_queue, batch/n)]]
I think the problem comes from the XML data used when I made the tfrecord; the following error was reported during data conversion:
Conversion progress:[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]100% 110/110
Conversion is complete!
Segmentation fault (core dumped)
But at test time the data is read in random order, and with a very large dataset it takes too long to find the bad XML file by hand. Is there any way to change the data reading to sequential order?
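
Assuming data/io/read_tfrecord.py builds its input queue with tf.train.string_input_producer, the reading order can be made sequential by turning off shuffling, so the record that triggers the failure can be located deterministically. A minimal sketch:

# Sketch: read a tfrecord file in order, once, instead of shuffled.
import tensorflow as tf

filename_queue = tf.train.string_input_producer(
    ['test.tfrecord'], shuffle=False, num_epochs=1)  # num_epochs=1 requires local variable init
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
# Parse serialized_example the same way read_tfrecord.py does, and log the
# image name of every record; the first record that fails points to the bad XML.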

Missing libs

Hi,

I really appreciate your effort in sharing this nice implementation. In this repository I could not find some libs, including configs, fast_rcnn, rpn, ..., which are necessary to run your code.

Do you have a plan to update this repository into a complete one?
