yolo_tensorflow's Introduction

YOLO_tensorflow

TensorFlow implementation of YOLO, including the training and test phases.

Installation

  1. Clone yolo_tensorflow repository

    $ git clone https://github.com/hizhangp/yolo_tensorflow.git
    $ cd yolo_tensorflow
  2. Download Pascal VOC dataset, and create correct directories

    $ ./download_data.sh
  3. Download YOLO_small weight file and put it in data/weights

  4. Modify configuration in yolo/config.py

  5. Training

    $ python train.py
  6. Test

    $ python test.py

Requirements

  1. Tensorflow

  2. OpenCV

yolo_tensorflow's People

Contributors

ck196, hizhangp, robrao

yolo_tensorflow's Issues

minor typo in train.py

Hi,

I spotted a minor typo in train.py, lines 110 and 111. I think the closing bracket is misplaced.

Best

[question] the reason of transpose?

Hi, many thanks for sharing your work!

I found a transpose in your code:
net = tf.transpose(net, [0, 3, 1, 2], name='trans_31')

This is where the conv layer output gets flattened before the fc layers. May I ask why the transpose is there? Is this a standard way to do it? Does the YOLO paper recommend it?
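
One possible reason, stated here as an assumption rather than the author's confirmed intent: flattening a channels-last (NHWC) tensor and a channels-first (NCHW) tensor visits the same values in a different order, so fully connected weights trained against one ordering only line up with the other after a transpose. A small NumPy sketch of the effect on a made-up 2x2x3 feature map:

```python
import numpy as np

# A tiny feature map in NHWC layout: batch=1, height=2, width=2, channels=3.
nhwc = np.arange(12).reshape(1, 2, 2, 3)

# Same permutation as tf.transpose(net, [0, 3, 1, 2]): NHWC -> NCHW.
nchw = np.transpose(nhwc, (0, 3, 1, 2))

flat_nhwc = nhwc.reshape(1, -1)  # channel index varies fastest
flat_nchw = nchw.reshape(1, -1)  # values of one channel stay together

print(flat_nhwc[0])
print(flat_nchw[0])
```

The two flattened vectors hold the same numbers in different positions, which is why the order matters for the first fc layer.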

Help!

When I trained the network, there was a ValueError, "Unknown format code 'f' for object of type 'str'", in train.py, and the program stopped at the line 'train_timer.remain(step, self.max_iter))'. Why?

Some problems occurred in testing

Hi, I'm interested in your amazing project.
I trained a model following your steps, but when I tested it, nothing was detected in the picture.
Running test.py with YOLO_small.ckpt works fine, but I cannot detect anything with save.ckpt-15000 and similar checkpoints.
I changed nothing in your train.py. Could you tell me what might be going wrong?

Load pretrained model YOLO_small.ckpt error

When I restored YOLO_small.ckpt, there was an error: "Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?"

As far as I know, TensorFlow saves a model as four files (checkpoint, index, data and meta). Why does this checkpoint consist of only one file, and how can I use it?
Thanks to anyone who can help.

By the way, when I trained the model on Pascal VOC 2007 without the pre-trained model, the loss did not decrease and fluctuated between 8 and 12.

Results never changed

Whatever photo I feed in, net_output is the same, and the final image shown always has the same frame drawn on it. I am upset about this.
I hope someone can give me some advice.
Thank you.

def detect_from_cvmat(self, inputs):
    net_output = self.sess.run(self.net.logits,
                               feed_dict={self.net.images: inputs})

Error occurs in training

AttributeError: 'YOLONet' object has no attribute 'loss'?
I checked the YOLONet class and it really has no 'loss' attribute. Is that a mistake?

Why the training makes no efforts?

Dear all:
I have been training this network for thousands of iterations, but I can hardly see any change in the loss value. What is wrong? Has anyone verified that the training works?

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [539] rhs shape= [1470]

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [539] rhs shape= [1470]
[[Node: save/Assign_52 = Assign[T=DT_FLOAT, _class=["loc:@yolo/fc_36/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](yolo/fc_36/biases, save/RestoreV2_52/_5)]]

I trained on my own dataset, which has only one class, and hit this problem.
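
The two sizes in the error message match the length of the flat YOLO v1 prediction vector, S * S * (B * 5 + C), for 20 classes and for 1 class. The sketch below assumes the paper's grid size S = 7 and B = 2 boxes per cell; the values in yolo/config.py may differ.

```python
def yolo_output_size(cell_size, boxes_per_cell, num_classes):
    """Length of the flat YOLO v1 prediction vector: S * S * (B * 5 + C)."""
    return cell_size * cell_size * (boxes_per_cell * 5 + num_classes)


voc_size = yolo_output_size(7, 2, 20)       # 20 Pascal VOC classes
one_class_size = yolo_output_size(7, 2, 1)  # a single custom class
print(voc_size, one_class_size)  # 1470 539
```

Under that reading, the checkpoint holds a last layer with 1470 outputs while the new graph expects 539, so that layer cannot be restored as-is from the pre-trained file.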

Running on python3.x

Hi again!
Recently I've spent time reading the YOLO papers and trying to understand your code.
Now I think it's time to fine-tune YOLO and then make some modifications to its basic structure.

But soon after I managed to prepare my own dataset from ImageNet, I ran into another issue regarding the Python version. I'm using Python 3.x and I have absolutely no idea about Python 2 or compatibility issues.

I have decided to first convert your code so that it runs on Python 3.x and then use it.
This weekend I'm going to convert all the Python 2 print statements to Python 3 print function calls, and then I will run the code.
ex) print 'Hello world!' ----> print('Hello World')

So, can you tell me what other issues I have to consider, besides print, when moving from py2.x to py3.x?

Thanks in advance!
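
Beyond print, a few other differences commonly show up when porting Python 2 code. The snippet below is a general illustration in Python 3, not a list of the exact changes this repository needs:

```python
# Python 3 forms of constructs that behave differently in Python 2.
import pickle  # the Python 2 module cPickle no longer exists

# 1. print is a function.
print('Hello World')

# 2. "/" is true division; use "//" where an integer result is expected.
cell_index = 7 // 2   # 3
ratio = 7 / 2         # 3.5

# 3. xrange is gone; range is already lazy.
squares = [i * i for i in range(4)]

# 4. dict.iteritems() is gone; use items().
labels = {'person': 0, 'car': 1}
pairs = sorted(labels.items())

# 5. Strings are unicode; pickled data must be read from files opened in 'rb' mode.
restored = pickle.loads(pickle.dumps(labels))
```

Integer division is the easiest one to miss, because the code still runs but array indices silently become floats.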

I have a problem about test.py

NotFoundError (see above for traceback): Tensor name "yolo/conv_2/biases" not found in checkpoint files data/weights/YOLO_small.ckpt

error loading the weight file (successfully trained with own dataset and saved)when doing the test

I have successfully trained the model on my own dataset (which has two classes) and saved the weight file. But when I load the saved model for testing, something goes wrong with the weight file:

/core/util/tensor_slice_reader.cc:95] Could not open ./data/weights/save.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

The error appears in the line:

self.saver.restore(self.sess, self.weights_file)

The version of TensorFlow I am currently using is 1.2. I'm not sure whether it is a version problem (I searched the Internet for this error and found similar reports related to the TensorFlow version). Could you help me with this issue and tell me how to restore weight files that I have trained myself?
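
TensorFlow 1.x savers write the newer checkpoint format by default, which stores one checkpoint as several files sharing a prefix (`<prefix>.index`, `<prefix>.data-00000-of-00001`, and usually `<prefix>.meta`). `saver.restore` expects that prefix, not the name of one of the files, so pointing it at a renamed data file can produce this kind of error. The helper below is a hypothetical, standard-library-only sketch for checking what is actually on disk before restoring:

```python
import glob
import os


def describe_checkpoint(prefix):
    """Guess how a checkpoint prefix is stored on disk (hypothetical helper)."""
    has_index = os.path.isfile(prefix + '.index')
    has_data = bool(glob.glob(glob.escape(prefix) + '.data-*'))
    if has_index and has_data:
        return 'v2'           # pass `prefix` itself to saver.restore
    if os.path.isfile(prefix):
        return 'single-file'  # one file, e.g. an older-style checkpoint
    return 'missing'
```

For example, `describe_checkpoint('data/weights/save.ckpt-15000')` returning `'v2'` would suggest passing exactly that prefix as the weights file, without renaming anything.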

Problems About model testing

I've tried to use test.py to test the model trained by train.py. I renamed the model to save.ckpt and used it as the weights for test.py, but it seems that test.py cannot open that file. Is my way of testing wrong?

No training?help

Hello, why hasn't my loss changed from the beginning to the end of training? Do you know what the reason is? Thank you very much!

yolo v2

  • Did you implement YOLO v2 in this project? If not, do you plan to?

Thank you very much.

train my own pictures

If I want to train this network on my own pictures, should I put my data in the same format as VOC?

training time issue

Hi, I'm a beginner in deep learning and I'm trying to fine-tune some of the well-known object detectors such as YOLO. First I want to train YOLO to classify a single class (person or none) and also to find the bounding boxes of multiple objects (people) in an image or in videos.

I think I can use just the PASCAL VOC dataset for fine-tuning. I have a laptop with an i5-6200U and a GeForce 940M. I know it's not a good environment, especially for training CNNs.

So, how long would you expect fine-tuning YOLO to take if I use roughly 1000 images?

Thanks in advance.

P.S.
Sorry for my bad English, and I hope you understood what I meant!

Is there something wrong with the loss function?

predict_boxes_tran = tf.stack(
[(predict_boxes[..., 0] + offset) / self.cell_size,
(predict_boxes[..., 1] + offset_tran) / self.cell_size,
tf.square(predict_boxes[..., 2]),
tf.square(predict_boxes[..., 3])], axis=-1)

Why do you take the square of the predicted h and w here?

And in
boxes_delta = coord_mask * (predict_boxes - boxes_tran)

you don't take the square root of the predicted h and w.

So are the W and H output by the network already square roots?
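
For context, the YOLO paper computes the size terms of the loss on square roots of width and height, so that the same absolute error counts more for a small box than for a large one. If the network's raw size outputs are read as square roots (an assumption about this code, consistent with the squaring shown above), then squaring recovers w and h where real sizes are needed, while the loss compares values in square-root space. A small numeric sketch with made-up sizes:

```python
import math


def sqrt_space_error(true_size, predicted_size):
    """Squared error between two sizes after taking square roots."""
    return (math.sqrt(true_size) - math.sqrt(predicted_size)) ** 2


# The same absolute error of 0.05 (in normalized image units)...
small_box = sqrt_space_error(0.10, 0.15)
large_box = sqrt_space_error(0.80, 0.85)

# ...is penalized more for the small box than for the large one.
print(small_box > large_box)  # True

# A raw output interpreted as sqrt(w) is squared to recover w.
raw_output = math.sqrt(0.15)
recovered_width = raw_output ** 2
```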

help

How do I train on my own dataset? I need to use my images to detect defects. I tested with VOC2007 and it works, but when I switch to my own images a KeyError occurs. In addition, can the model in the program be changed, or replaced with a simpler one? Please get in touch with me.

Test IOU

I have run test.py successfully and got a visualization for person.jpg. I also want to compute the IoU on the VOC2007 test dataset. How can I do that? Has anyone measured the test IoU on the VOC2007 test dataset? Thank you.
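
As a starting point, IoU for a single pair of boxes can be computed as below. This is a minimal sketch that assumes boxes are given as (x_center, y_center, width, height); matching detections to ground-truth boxes across the whole test set is a separate step not shown here.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x_center, y_center, width, height)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2.0, box_a[1] - box_a[3] / 2.0
    ax2, ay2 = box_a[0] + box_a[2] / 2.0, box_a[1] + box_a[3] / 2.0
    bx1, by1 = box_b[0] - box_b[2] / 2.0, box_b[1] - box_b[3] / 2.0
    bx2, by2 = box_b[0] + box_b[2] / 2.0, box_b[1] + box_b[3] / 2.0

    # Overlap along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - intersection
    return intersection / union if union > 0 else 0.0


print(iou((50, 50, 20, 20), (50, 50, 20, 20)))  # 1.0
```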

python test.py

Hi, can anyone help?

When I run python test.py I get the following error:

abdulrahman@abdulrahman-ThinkPad-X230-Tablet:~/yolo_tensorflow$ python test.py
Restoring weights from: data/weights/YOLO_small.ckpt
Average detecting time: 0.608s
Traceback (most recent call last):
  File "test.py", line 190, in <module>
    main()
  File "test.py", line 186, in main
    detector.image_detector(imname)
  File "test.py", line 161, in image_detector
    self.draw_result(image, result)
  File "test.py", line 43, in draw_result
    cv2.putText(img, result[i][0] + ' : %.2f' % result[i][5], (x - w + 5, y - h - 7), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 1, cv2.CV_AA)
AttributeError: 'module' object has no attribute 'CV_AA'
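
The anti-aliased line flag is named `cv2.CV_AA` in the OpenCV 2.4 Python bindings and `cv2.LINE_AA` in OpenCV 3 and later, so this error is what an OpenCV 3+ installation reports for code written against 2.4. The helper below is a hypothetical sketch that picks whichever name the installed module provides:

```python
def antialiased_line_type(cv2_module):
    """Return the anti-aliased line flag under either OpenCV naming scheme."""
    if hasattr(cv2_module, 'LINE_AA'):  # OpenCV 3.x and later
        return cv2_module.LINE_AA
    return cv2_module.CV_AA             # OpenCV 2.4.x


# Intended use inside draw_result (not executed here):
#   import cv2
#   cv2.putText(img, text, org, cv2.FONT_HERSHEY_SIMPLEX, 0.5,
#               (0, 0, 0), 1, antialiased_line_type(cv2))
```

If only one OpenCV version has to be supported, replacing `cv2.CV_AA` with `cv2.LINE_AA` directly is simpler.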

Training with gpu options

Hi,
I'm trying to run train.py on the GPU with
$ python train.py --gpu 0
It runs without errors, but it is so slow that I see no speed improvement compared to running without the gpu option.
I installed CUDA without problems, and other CUDA-based applications run fine.
I printed the device placements with the log_device_placement option and found that everything was running on the CPU.
Can anyone tell me why this happens?

Train my own dataset

Hi, I trained on my own dataset with yolo_tensorflow, but got the following error:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [539] rhs shape= [1470] [[Node: save/Assign_49 = Assign[T=DT_FLOAT, _class=["loc:@Variable_53"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](Variable_53, save/RestoreV2_49/_1)]]
I don't know how to deal with this error.

How train the yolo networks without pre-train weight(don't use pre-train weights)

I am trying to train the YOLO network without the pre-trained weights, but I get some errors.
This is what I did:

  1. In 'yolo/config.py', changed WEIGHTS_FILE = os.path.join(DATA_PATH, 'weights', 'YOLO_small.ckpt')
    to WEIGHTS_FILE = None
  2. Commented out 'parser.add_argument('--weights', default="YOLO_small.ckpt", type=str)', because I want to train the YOLO network without pre-trained weights
  3. Commented out 'cfg.WEIGHTS_FILE = os.path.join(cfg.WEIGHTS_DIR, args.weights)', for the same reason

Please tell me how to do this. I know my approach is clumsy, but I still want to try. Thanks.
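
Instead of commenting lines out, the path handling and the restore call can both be guarded so that a missing weights name simply means "train from scratch". The function names below are hypothetical and only illustrate the pattern; they are not taken from train.py.

```python
import os


def resolve_weights_file(weights_dir, weights_name):
    """Return a checkpoint path, or None when training from scratch."""
    if not weights_name:
        return None
    return os.path.join(weights_dir, weights_name)


def maybe_restore(restore_fn, weights_file):
    """Call restore_fn only when a checkpoint path was given."""
    if weights_file is None:
        return False
    restore_fn(weights_file)
    return True
```

With this pattern, `os.path.join` is never called with `None`, and the restore step is skipped rather than failing.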

UnboundLocalError: local variable 'frame' referenced before assignment

python test.py
Restoring weights from: data/weights/YOLO_small.ckpt
Traceback (most recent call last):
  File "test.py", line 183, in <module>
    main()
  File "test.py", line 175, in main
    detector.camera_detector(cap)
  File "test.py", line 140, in camera_detector
    result = self.detect(frame)
UnboundLocalError: local variable 'frame' referenced before assignment
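
This error means `frame` was never assigned before `self.detect(frame)` ran. One situation in which that can happen is a camera that returns no frame; that is an assumption, since the report does not show the capture loop. `cv2.VideoCapture.read()` returns a success flag together with the image, and checking that flag avoids using a frame that was never read. A minimal sketch with a hypothetical generator:

```python
def read_frames(cap):
    """Yield frames from a cv2.VideoCapture-like object until a read fails."""
    while True:
        ret, frame = cap.read()
        if not ret:  # no frame available: camera missing, busy, or stream ended
            break
        yield frame


# Intended use (not executed here):
#   for frame in read_frames(cap):
#       result = detector.detect(frame)
```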

OS: Ubuntu 16.04.1 (64 Bits)
Python 2.7.12
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import cv2
>>> cv2.__version__
'2.4.13'
OpenCV version 2.4.13
pip 9.0.1

pip list:

appdirs (1.4.2)
apturl (0.5.2)
asgi-redis (1.0.0)
asgiref (1.0.0)
autobahn (0.17.1)
beautifulsoup4 (4.4.1)
bleach (1.5.0)
blinker (1.3)
Brlapi (0.6.4)
channels (1.0.3)
chardet (2.3.0)
checkbox-support (0.22)
command-not-found (0.3)
constantly (15.1.0)
cryptography (1.2.3)
daphne (1.0.2)
decorator (4.0.11)
defer (1.0.6)
Django (1.10.5)
entrypoints (0.2.2)
feedparser (5.1.3)
guacamole (0.9.2)
html5lib (0.9999999)
httplib2 (0.9.1)
idna (2.0)
incremental (16.10.1)
ipykernel (4.5.2)
ipython (5.2.2)
ipython-genutils (0.1.0)
ipywidgets (5.2.2)
Jinja2 (2.9.5)
jsonschema (2.6.0)
jupyter (1.0.0)
jupyter-client (4.4.0)
jupyter-console (5.1.0)
jupyter-core (4.2.1)
language-selector (0.1)
louis (2.6.4)
lxml (3.5.0)
Mako (1.0.3)
MarkupSafe (0.23)
mistune (0.7.3)
msgpack-python (0.4.8)
nbconvert (5.1.1)
nbformat (4.2.0)
nltk (3.2.2)
notebook (4.3.2)
numpy (1.12.0)
oauthlib (1.0.3)
onboard (1.2.0)
packaging (16.8)
padme (1.1.1)
pandas (0.19.2)
pandocfilters (1.4.1)
pexpect (4.2.1)
pickleshare (0.7.4)
Pillow (3.1.2)
pip (9.0.1)
plainbox (0.25)
prompt-toolkit (1.0.13)
protobuf (3.2.0)
ptyprocess (0.5.1)
pyasn1 (0.1.9)
pycups (1.9.73)
pycurl (7.43.0)
Pygments (2.2.0)
pygobject (3.20.0)
PyJWT (1.3.0)
pyparsing (2.1.10)
python-apt (1.1.0b1)
python-dateutil (2.6.0)
python-debian (0.1.27)
python-systemd (231)
pytz (2016.10)
pyxdg (0.25)
pyzmq (16.0.2)
qtconsole (4.2.1)
redis (2.10.5)
reportlab (3.3.0)
requests (2.9.1)
scikit-learn (0.18.1)
scipy (0.18.1)
screen-resolution-extra (0.0.0)
sessioninstaller (0.0.0)
setuptools (34.3.0)
simplegeneric (0.8.1)
six (1.10.0)
system-service (0.3)
tensorflow (1.0.0)
terminado (0.6)
testpath (0.3)
tornado (4.4.2)
tqdm (4.11.2)
traitlets (4.3.1)
Twisted (16.6.0)
txaio (2.6.0)
ubuntu-drivers-common (0.0.0)
ufw (0.35)
unattended-upgrades (0.1)
unity-scope-calculator (0.1)
unity-scope-chromiumbookmarks (0.1)
unity-scope-colourlovers (0.1)
unity-scope-devhelp (0.1)
unity-scope-firefoxbookmarks (0.1)
unity-scope-gdrive (0.7)
unity-scope-manpages (0.1)
unity-scope-openclipart (0.1)
unity-scope-texdoc (0.1)
unity-scope-tomboy (0.1)
unity-scope-virtualbox (0.1)
unity-scope-yelp (0.1)
unity-scope-zotero (0.1)
urllib3 (1.13.1)
usb-creator (0.3.0)
wcwidth (0.1.7)
wheel (0.29.0)
widgetsnbextension (1.2.6)
xdiagnose (3.8.4.1)
xkit (0.0.0)
XlsxWriter (0.7.3)
zope.interface (4.3.3)

ValueError: Unknown format code 'f' for object of type 'str'

2018-03-31 21:29:02.437666: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 10024 get requests, put_count=10022 evicted_count=1000 eviction_rate=0.0997805 and unsatisfied allocation rate=0.109936
2018-03-31 21:29:02.437717: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:259] Raising pool_size_limit_ from 100 to 110
Traceback (most recent call last):
  File "train.py", line 165, in <module>
    main()
  File "train.py", line 158, in main
    solver.train()
  File "train.py", line 90, in train
    train_timer.remain(step, self.max_iter))
ValueError: Unknown format code 'f' for object of type 'str'
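
This message is what Python raises when a float format specification such as `{:.3f}` is applied to a string. A plausible reading of the traceback, offered as an assumption, is that `train_timer.remain(...)` returns text while the log format string expects a number at that position. The sketch below reproduces the error with a made-up value and shows two ways around it:

```python
remaining = '1:23:45'  # a time already formatted as text (hypothetical value)

try:
    '{:.3f}'.format(remaining)
except ValueError as error:
    message = str(error)

# Either use a plain placeholder for text, or format a real number.
as_text = 'Remain: {}'.format(remaining)
as_number = 'Load: {:.3f}s'.format(0.6084)
```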

Some places in loss_layer are different from the paper

image

  1. In the paper, the confidence loss for cells without an object is written as " C-C' ", but in the code it is "noobject_delta = noobject_mask * predict_scales".

  2. The values of λnoobj and λclass differ from the paper.
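
On the first point, the two forms give the same value, because the target confidence of a cell without an object is 0, so the difference between prediction and target reduces to the prediction itself. A NumPy sketch with made-up confidences:

```python
import numpy as np

predicted_confidence = np.array([0.9, 0.2, 0.05, 0.6])
object_mask = np.array([1.0, 0.0, 0.0, 1.0])  # 1 where a cell holds an object
noobject_mask = 1.0 - object_mask

# Target is zero wherever there is no object (0.8 is a made-up IoU target).
target_confidence = object_mask * 0.8

paper_form = noobject_mask * (predicted_confidence - target_confidence)
code_form = noobject_mask * predicted_confidence
print(np.allclose(paper_form, code_form))  # True
```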

yolo config file

Hi,
I wonder how I should edit the config file.
Which parameters should I adjust?

Two question of three lines of codes.

image = cv2.resize(image, (self.image_size, self.image_size))
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
image = (image / 255.0) * 2.0 - 1.0

I have two questions about these three lines of code.

  1. After the 'resize' operation, why do we need to call the cvtColor function?

  2. What does the third line do?

I want to train on my own data, but the resolution of my samples is too low. After these three operations the result is terrible, so I would like to only resize my samples and skip the last two operations. After reading your code, I can't tell whether removing them would have any side effects. (I am not a native English speaker; thanks for your answer.)
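
For reference, OpenCV loads images in BGR channel order, so the second line reorders the channels to RGB, and the third line rescales pixel values from [0, 255] to [-1, 1]. If the pre-trained weights were trained on inputs prepared this way (an assumption), skipping either step would feed the network data it has not seen. The NumPy-only sketch below mimics both steps on a made-up 2x2 image:

```python
import numpy as np

# A fake 2x2 BGR image with 8-bit values, standing in for cv2.imread output.
bgr = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [128, 128, 128]]], dtype=np.uint8)

# Reversing the last axis swaps B and R, which is what COLOR_BGR2RGB does.
rgb = bgr[..., ::-1].astype(np.float32)

# Map pixel values from [0, 255] to [-1, 1].
normalized = (rgb / 255.0) * 2.0 - 1.0
print(normalized.min(), normalized.max())  # -1.0 1.0
```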

Strange phenomenon during my experiment

@hizhangp I used the pascal_voc2007 dataset to train the network and tried to use the trained network for prediction. I found that whatever image is fed to the network, it always generates the same output (both the bounding boxes and the confidences are the same for different images), even though the total_loss is quite low.
So I reviewed the code to understand what happens during training, and there are two places in "loss_layer" that I find really hard to follow.
First, in "boxes = tf.tile(boxes, [1, 1, 1, self.boxes_per_cell, 1]) / self.image_size", why do the "boxes" need to be divided by "image_size" when predict_boxes_tran is only divided by cell_size?
Second, why is "offset" added to the predicted box when we calculate predict_boxes_tran? Can we remove it?
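
One way to read the two operations is that both bring coordinates into the same unit, a fraction of the whole image, before boxes are compared. Labels start in pixels, so they are divided by the image size; predictions start as an offset inside a grid cell, so the cell index is added and the sum is divided by the grid size. The sketch below assumes an image size of 448 and a 7x7 grid (the YOLO paper's values; the config may differ):

```python
IMAGE_SIZE = 448  # assumed input resolution
CELL_SIZE = 7     # assumed grid size


def label_x_to_image_fraction(x_pixels):
    """Ground-truth center in pixels -> fraction of the image width."""
    return x_pixels / IMAGE_SIZE


def prediction_x_to_image_fraction(x_in_cell, cell_column):
    """Predicted offset inside a cell plus the cell's column -> image fraction."""
    return (x_in_cell + cell_column) / CELL_SIZE


# A center at pixel 224 lies in column 3, halfway across that cell.
print(label_x_to_image_fraction(224.0))        # 0.5
print(prediction_x_to_image_fraction(0.5, 3))  # 0.5
```

Without the offset, every prediction would be placed relative to the first cell, so the two values above would no longer agree.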

Is there any mistake in network architecture ?

I noticed that, according to the YOLO paper, YOLO has only 2 fc layers. However, there are 3 fc layers in this code, and the 512-unit fc layer does not exist in the paper. I doubt whether this code is doing it the right way.

By the way, can you tell me how you obtained the pretrained model? I mean, how was YOLO_small.ckpt generated, given that the YOLO source code uses darknet rather than TensorFlow?

Question on yolo backward function

Hi there, sorry to post this here. I have been trying to write YOLO in Torch and have struggled with the backward function for many days, but the gradients always explode slowly. Could you kindly shed some light on my code? Thank you so much.

gradInput[{ {}, {}, 1, {}, {} }] = self.mse:backward(torch.cmul(self.x_buffer, x, coord_mask), tx)
gradInput[{ {}, {}, 2, {}, {} }] = self.mse:backward(torch.cmul(self.y_buffer, y, coord_mask), ty)
gradInput[{ {}, {}, 3, {}, {} }] = self.mse:backward(torch.cmul(self.w_buffer, w, coord_mask), tw)
gradInput[{ {}, {}, 4, {}, {} }] = self.mse:backward(torch.cmul(self.h_buffer, h, coord_mask), th)
gradInput[{ {}, {}, 5, {}, {} }] = self.mse:backward(torch.cmul(self.conf_buffer, conf, coord_mask), tconf)
gradInput[{ {}, {}, { 6, 5 + nC }, {}, {} }][self.cls_mask] = self.ce:backward(torch.cmul(self.cls_buffer, cls), tcls)

why add tf.pad operator before slim.conv2d?

@hizhangp
net = tf.pad(images, np.array([[0, 0], [3, 3], [3, 3], [0, 0]]), name='pad_1')
net = slim.conv2d(net, 64, 7, 2, padding='VALID', scope='conv_2')
I don't understand why tf.pad was added before slim.conv2d. Furthermore, it expands the size of the network.
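
One likely motivation, stated as an assumption: for a 7x7 convolution with stride 2, TensorFlow's SAME padding is asymmetric, whereas padding 3 pixels on every side and then using VALID gives symmetric padding, as frameworks with explicit padding do. That can matter when weights converted from another framework are loaded. The output size is the same either way, and padding adds no trainable parameters. The arithmetic below assumes a 448-pixel input:

```python
import math


def valid_conv_output(size, kernel, stride, pad_each_side):
    """Output length of a VALID convolution after explicit symmetric padding."""
    return (size + 2 * pad_each_side - kernel) // stride + 1


def same_padding(size, kernel, stride):
    """(pad_before, pad_after) that TensorFlow's SAME padding would apply."""
    out = math.ceil(size / stride)
    total = max((out - 1) * stride + kernel - size, 0)
    return total // 2, total - total // 2


print(valid_conv_output(448, 7, 2, 3))  # 224
print(same_padding(448, 7, 2))          # (2, 3)
```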

weight file which made by train.py was not worked

In config.py I modified just

WEIGHTS_FILE = None
BATCH_SIZE = 5

But after training for about 3 hours, I ran test.py and there were no boxes on imname = 'test/cat.jpg'.

What is the problem? I just want to get my own weights file from my own training.

When I train with WEIGHTS_FILE = os.path.join(DATA_PATH, 'weights', 'YOLO_small.ckpt'),
it works well.

One question of the output tensor of yolo.

Dear author! I want to know how this number is obtained. According to the YOLO paper, the output tensor of YOLO should be [7x7x(2x5+20)], i.e. [7x7x30], but I found that the output of your code is 7x7x25. So I want to know whether there is any difference. (Forgive my grammar errors; I am not a native English speaker.)
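
The per-cell depth follows B * 5 + C, which is 30 for B = 2 boxes and C = 20 classes; a depth of 25 would correspond to other values of B or C (for instance B = 1 with C = 20), which the report does not specify. The sketch below shows one way a flat prediction vector can be split into class probabilities, confidences and boxes. The block order used here is an assumption for illustration.

```python
import numpy as np

S, B, C = 7, 2, 20                    # assumed grid size, boxes per cell, classes
flat = np.zeros(S * S * (B * 5 + C))  # stand-in for one flat prediction vector

class_end = S * S * C                 # end of the class-probability block
conf_end = class_end + S * S * B      # end of the confidence block

class_probs = flat[:class_end].reshape(S, S, C)
confidences = flat[class_end:conf_end].reshape(S, S, B)
boxes = flat[conf_end:].reshape(S, S, B, 4)

depth_per_cell = C + B + B * 4
print(flat.size, depth_per_cell)  # 1470 30
```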
