lars76 / object-localization

Object localization in images using simple CNNs and Keras

License: MIT License

Python 100.00%

convolutional-neural-network cnn cnn-keras keras object-detection object-localisation mobilenet mobilenetv2

object-localization

This project shows how to localize objects in images by using simple convolutional neural networks.

Dataset

Before getting started, we have to download a dataset and generate a CSV file containing the annotations (bounding boxes).

Download The Oxford-IIIT Pet Dataset
Download The Oxford-IIIT Pet Dataset Annotations
tar xf images.tar.gz
tar xf annotations.tar.gz
mv annotations/xmls/* images/
python3 generate_dataset.py
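
As a rough illustration of what the CSV generation step involves, the sketch below reads one Pascal VOC style annotation (the format of the Oxford-IIIT Pet xml files) and writes a single CSV row. The column order path, x0, y0, x1, y1 is an assumption taken from the unpacking loop shown in one of the tracebacks further down; the helper name xml_to_row and the annotation values are hypothetical, and generate_dataset.py itself may work differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC style, trimmed to the fields used here.
EXAMPLE_XML = """
<annotation>
  <filename>Abyssinian_1.jpg</filename>
  <size><width>600</width><height>400</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>333</xmin><ymin>72</ymin><xmax>425</xmax><ymax>158</ymax></bndbox>
  </object>
</annotation>
"""


def xml_to_row(xml_text, image_dir="images"):
    """Return [path, x0, y0, x1, y1] for the first object in the annotation."""
    root = ET.fromstring(xml_text)
    box = root.find("object/bndbox")
    coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
    path = f"{image_dir}/{root.findtext('filename')}"
    return [path] + coords


row = xml_to_row(EXAMPLE_XML)

# Write the row to an in-memory buffer instead of a real train.csv.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
print(buffer.getvalue().strip())
```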

Single-object detection

Example 1: Finding dogs/cats

Architecture

First, let's look at YOLOv2's approach:

Pretrain Darknet-19 on ImageNet (feature extractor)
Remove the last convolutional layer
Add three 3 x 3 convolutional layers with 1024 filters
Add a 1 x 1 convolutional layer with the number of outputs needed for detection

We proceed in the same way to build the object detector:

Choose a model from Keras Applications as the feature extractor
Remove the dense layer
Freeze some/all/no layers
Add one, several, or no convolution blocks (or _inverted_res_block for MobileNetv2)
Add a convolution layer for the coordinates
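
The steps above can be sketched with tf.keras as follows. This is a minimal sketch, not the repository's train.py: it assumes TensorFlow is installed, uses weights=None so that nothing is downloaded, freezes the whole backbone, and skips the optional extra convolution block. The layer name "coords" and the function name create_model are illustrative choices.

```python
from tensorflow.keras import Model
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Conv2D, Reshape

IMAGE_SIZE = 96   # input resolution
ALPHA = 0.35      # MobileNetV2 width multiplier


def create_model(trainable=False, weights=None):
    # Steps 1-2: feature extractor without the dense classification head.
    # Pass weights="imagenet" to start from pretrained features (downloads weights).
    backbone = MobileNetV2(input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
                           include_top=False, alpha=ALPHA, weights=weights)

    # Step 3: freeze all backbone layers (trainable=True fine-tunes everything).
    for layer in backbone.layers:
        layer.trainable = trainable

    # Step 4 is skipped in this sketch (no extra convolution block).
    # Step 5: a 3x3 convolution reduces the 3x3 feature map to 4 coordinates.
    x = Conv2D(4, kernel_size=3, name="coords")(backbone.output)
    x = Reshape((4,))(x)
    return Model(inputs=backbone.input, outputs=x)


model = create_model()
print(model.output_shape)
```

With a 96x96 input the backbone produces a 3x3 feature map, which is why a single unpadded 3x3 convolution is enough to reach one 4-value output per image.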

The code in this repository uses MobileNetv2 because it is faster than other models and its speed/accuracy trade-off can be adjusted. For example, if alpha = 0.35 with 96x96 input is not good enough, one can simply increase both values (see here for a comparison). If you use another architecture, change preprocess_input accordingly.

python3 example_1/train.py
Adjust the WEIGHTS_FILE in example_1/test.py (given by the last script)
python3 example_1/test.py

Result

In the following images red is the predicted box, green is the ground truth:

Example 2: Finding dogs/cats and distinguishing classes

This time we have to run the scripts example_2/train.py and example_2/test.py.

Changes

In order to distinguish between classes, we have to modify the loss function. Here I use w_1*log((y_hat - y)^2 + 1) + w_2*FL(p_hat, p), where w_1 = w_2 = 1 are two weights and FL(p_hat, p) = -(0.9*(1 - p_hat)^2*p*log(p_hat) + 0.1*p_hat^2*(1 - p)*log(1 - p_hat)) is the focal loss.
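
To make the formula concrete, here is a NumPy sketch of the same expression. It is not the code from example_2/train.py. Two details are assumptions that the text does not specify: the box term is summed over the four coordinates, and p_hat is clipped away from 0 and 1 to keep the logarithms finite.

```python
import numpy as np


def focal_loss(p_hat, p, eps=1e-7):
    """FL(p_hat, p) with the 0.9 / 0.1 class weights and exponent 2 from the text."""
    p_hat = np.clip(p_hat, eps, 1.0 - eps)  # assumption: avoid log(0)
    positive = 0.9 * (1.0 - p_hat) ** 2 * p * np.log(p_hat)
    negative = 0.1 * p_hat ** 2 * (1.0 - p) * np.log(1.0 - p_hat)
    return -(positive + negative)


def combined_loss(y_hat, y, p_hat, p, w_1=1.0, w_2=1.0):
    """w_1 * log((y_hat - y)^2 + 1) + w_2 * FL(p_hat, p)."""
    # Assumption: the box term is summed over the 4 coordinates.
    box_term = np.sum(np.log((y_hat - y) ** 2 + 1.0), axis=-1)
    return w_1 * box_term + w_2 * focal_loss(p_hat, p)


box_true = np.array([0.1, 0.2, 0.8, 0.9])
box_pred = np.array([0.2, 0.2, 0.7, 0.9])

# A confident, correct prediction gives a loss close to zero.
good = combined_loss(box_true, box_true, p_hat=0.999, p=1.0)
# A shifted box and an uncertain class probability give a larger loss.
bad = combined_loss(box_pred, box_true, p_hat=0.5, p=1.0)
print(good, bad)
```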

Instead of using all 37 classes, the code will only output class 0 (contains only class 0) or class 1 (contains class 1 to 36). However, it is easy to extend this to more classes (use categorical cross entropy instead of focal loss and try out different weights).

Multi-object detection

Example 3: Segmentation-like detection

Architecture

In this example, we use a skip-net architecture similar to U-Net. For an in-depth explanation see my blog post.

Result

Example 4: YOLO-like detection

Architecture

This example is based on the three YOLO papers. For an in-depth explanation see this blog post.

Result

Guidelines

Improve accuracy (IoU)

enable augmentations: see example_4; the same code can be added to the other examples
better augmentations: try out different values (flips, rotation etc.)
for MobileNetv1/2: increase ALPHA and IMAGE_SIZE in train_model.py
other architectures: increase IMAGE_SIZE
add more layers
try out other loss functions (MAE, smooth L1 loss etc.)
other optimizer: SGD with momentum 0.9, adjust learning rate
use a feature pyramid
read keras-team/keras#9965

Increase training speed

increase BATCH_SIZE
fewer layers, smaller IMAGE_SIZE and ALPHA

Overfitting

If the new dataset is small and similar to ImageNet, freeze all layers.
If the new dataset is small and not similar to ImageNet, freeze some layers.
If the new dataset is large, freeze no layers.
read http://cs231n.github.io/transfer-learning/

object-localization's Issues

Unable to open file (unable to open file: name = 'model-0.29.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

Hi,

I've run the test.py script, but I got an error:

Unable to open file (unable to open file: name = 'model-0.29.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

I looked in my directory, but there is no file with that name. I do have the files 'model-0.44.h5' through 'model-0.51.h5'. Even after I changed 'model-0.29.h5' to 'model-0.44.h5' in WEIGHTS_FILE in test.py and ran it again, it does not work. Why isn't this working and how can I fix it? Should the file 'model-0.29.h5' be in my directory? Should I change save_best_only=TRUE to FALSE in the train.py script?

some images are corrupted

I see these two image files are corrupted each time I download them from the webpage:
beagle_116.jpg
chihuahua_121.jpg

Infer trained model on OpenVino

Is it possible to run inference with the trained model in OpenVINO, or even in the OpenCV dnn module? OpenVINO supports standard object detection models (MobileNet with SSD, YOLO, Mask R-CNN, etc.) and simple classification models with their respective architectures (e.g. Inception, MobileNet). So is it possible to convert the h5 Keras model into a native TensorFlow frozen graph and run it on OpenVINO?

IndexError: index 7 is out of bounds for axis 1 with size 7

Hi, I prepared the dataset, and when starting the training I get this error. I am trying to understand where it may come from, but I'm having difficulties.

File "train.py", line 175, in __getitem__
batch_boxes[i, floor_y, floor_x, 0] = (y1 - y0) / image_height
IndexError: index 7 is out of bounds for axis 1 with size 1
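
One possible cause, offered only as a guess because the surrounding code is not shown: if the cell index is computed as floor(coordinate * grid size), a normalized coordinate of exactly 1.0 (a box touching the image border) yields an index equal to the grid size, which is one past the last cell. The sketch below uses hypothetical names (GRID_SIZE, grid_cell) and clamps the index.

```python
import math

GRID_SIZE = 7  # hypothetical grid resolution, matching the "size 7" in the title


def grid_cell(center, grid_size=GRID_SIZE):
    """Map a normalized coordinate in [0, 1] to a valid cell index."""
    index = math.floor(center * grid_size)
    # center == 1.0 would give index == grid_size, one past the last cell.
    return min(max(index, 0), grid_size - 1)


print(grid_cell(0.5))  # 3
print(grid_cell(1.0))  # 6 instead of the out-of-range 7
```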

FileNotFoundError: [Errno 2] No such file or directory: 'train.csv'

I downloaded the dataset, but it keeps saying No such file or directory: 'train.csv'

What does STD and MEAN do?

Hi, I'm using your code as a jumping-off point for learning about localiser development, and was a bit confused about what STD and MEAN are used for. It looks like some kind of normalisation operation, but I'm not 100% sure. Any info you can give would be greatly appreciated.
Cheers.
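
If the guess in the question is right and the two constants are used for normalisation, the usual pattern is per-channel standardisation: subtract the mean and divide by the standard deviation. The sketch below illustrates that pattern only. It computes MEAN and STD from random stand-in data, so the values are not the ones used in the repository.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for a batch of RGB images with values in [0, 255].
images = rng.integers(0, 256, size=(8, 96, 96, 3)).astype(np.float32)

# Per-channel statistics computed over batch, height and width.
MEAN = images.mean(axis=(0, 1, 2))
STD = images.std(axis=(0, 1, 2))

# Standardise: each channel ends up with roughly zero mean and unit variance.
normalized = (images - MEAN) / STD
print(normalized.mean(axis=(0, 1, 2)))
print(normalized.std(axis=(0, 1, 2)))
```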

What if no ground truth available?

For example 1, if no ground truth available in the image, should I use all 0 to represent the gt bbox?

Dataset problem

Since the xml files have already been generated, can they be passed directly to the neural network for training? Or do they need to be converted into an array or CSV file beforehand?

loss start too high

Do you have any solution for this problem? The loss starts at 17 and does not decrease.

Empty train.csv and validation.csv files

Hello Lars, I consider myself a beginner in CNNs and would be grateful if you could help with a problem I have. I downloaded the dataset and put the .jpg and .xml files in the dataset folder. I decided to use only 4 instances for now to test the localization, but it didn't work. The train.csv and validation.csv files that are generated at the beginning when running generate_dataset.py are empty. Do you have any suggestions as to why it doesn't work for me?

Kind regards, Arseniy

confusion in iou calculation

def iou(y_true, y_pred):
    xA = K.maximum(y_true[...,0], y_pred[...,0])
    yA = K.maximum(y_true[...,1], y_pred[...,1])
    xB = K.minimum(y_true[...,2], y_pred[...,2])
    yB = K.minimum(y_true[...,3], y_pred[...,3])

    interArea = (xB - xA) * (yB - yA)

    boxAArea = (y_true[...,2] - y_true[...,0]) * (y_true[...,3] - y_true[...,1])
    boxBArea = (y_pred[...,2] - y_pred[...,0]) * (y_pred[...,3] - y_pred[...,1])
    return K.clip(interArea / (boxAArea + boxBArea - interArea + K.epsilon()), 0, 1)

In the above function, y_true and y_pred come directly from the training data and the model respectively, so they are scaled coordinates. Shouldn't the iou function convert them back to xmin, ymin, xmax, ymax and then calculate the IoU?
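
A point that may help here: IoU is a ratio of areas, so dividing all x values by the image width and all y values by the image height multiplies the intersection and both box areas by the same factor and leaves the ratio unchanged. The NumPy sketch below checks this on hypothetical boxes and a hypothetical image size. Unlike the quoted function, it clamps the intersection sides at zero and omits the epsilon term.

```python
import numpy as np


def iou(box_a, box_b):
    """IoU of two boxes given as (x0, y0, x1, y1)."""
    x_a = max(box_a[0], box_b[0])
    y_a = max(box_a[1], box_b[1])
    x_b = min(box_a[2], box_b[2])
    y_b = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes get an empty intersection.
    inter = max(x_b - x_a, 0.0) * max(y_b - y_a, 0.0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


width, height = 600.0, 400.0  # hypothetical image size
truth = np.array([100.0, 50.0, 300.0, 250.0])
pred = np.array([150.0, 100.0, 350.0, 300.0])
scale = np.array([width, height, width, height])

pixel_iou = iou(truth, pred)
scaled_iou = iou(truth / scale, pred / scale)
print(pixel_iou, scaled_iou)
```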

train_model.py

Reloaded modules: train_model
Traceback (most recent call last):

File "", line 1, in
runfile('C:/Users/huixu/Desktop/object-localization-0812/train_model.py', wdir='C:/Users/hhh/Desktop/object-localization-0812')

File "C:\Localdata\Python3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
execfile(filename, namespace)

File "C:\Localdata\Python3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "C:/Users/hhh/Desktop/object-localization-0812/train_model.py", line 122, in
main()

File "C:/Users/hhh/Desktop/object-localization-0812/train_model.py", line 118, in main
train(model, EPOCHS, BATCH_SIZE, PATIENCE, TRAIN_CSV, VALIDATION_CSV, MEAN, STD)

File "C:/Users/hhh/Desktop/object-localization-0812/train_model.py", line 102, in train
train_datagen = DataSequence(train_csv, batch_size, mean, std)

File "C:/Users/hhh/Desktop/object-localization-0812/train_model.py", line 47, in __init__
for index, (path, x0, y0, x1, y1) in enumerate(reader):

ValueError: not enough values to unpack (expected 5, got 0)
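
"expected 5, got 0" means the reader produced an empty row. Python's csv.reader returns an empty list for a blank line, and on Windows a file written with csv.writer without newline="" ends up with a blank line after every record, so that is one plausible cause (an assumption, since the CSV file itself is not shown). A minimal sketch that skips empty rows, using hypothetical file contents:

```python
import csv
import io

# Hypothetical CSV content with a blank line between records.
CSV_TEXT = "images/a.jpg,10,20,110,120\n\nimages/b.jpg,5,15,105,115\n"

rows = []
for record in csv.reader(io.StringIO(CSV_TEXT)):
    if not record:
        # Blank lines come back as [] and cannot be unpacked into 5 names.
        continue
    path, x0, y0, x1, y1 = record
    rows.append((path, int(x0), int(y0), int(x1), int(y1)))

print(len(rows))
```

When writing the file, opening it with open(path, "w", newline="") avoids the blank lines in the first place.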

Modifying the model for multi object localization in an image

Hi, thanks for your amazing work. This model works very nicely, but I have a question. When I feed the model an image with two or three cats, I get only one box and its accuracy is very low. How can I modify the model for multi-object detection? Which parts of train_model.py or evaluate_performance.py should be changed for that? Your response will be highly appreciated.

How to get the picture class_name?

Thanks for your code, it is very cool. Now I want to get both the localization and the class name (cat or dog) for a picture. What should I do?

Model?

Which model did you use: Faster_RCNN or another one? I was trying to do object localization using Faster_RCNN. I have only prepared datasets with annotations, but I don't know how to extract features from the annotation file as input to feed to the region proposal layer. Can you help, please?

TypeError: unsupported operand type(s) for -: 'tuple' and 'int'

When I run generate_dataset.py, I get this error. How can I fix this?

File "generate_dataset.py", line 115, in
main()
File "generate_dataset.py", line 62, in main
print("class {}: {} images".format(output[j-1][-2], i))
TypeError: unsupported operand type(s) for -: 'tuple' and 'int'

Program getting terminated after a particular error

Hi Lars,

I have tried running the code on a dataset for 100 epochs. After around 50 epochs, the error had dropped from 2000 (initially) to 25, but training stopped suddenly. However many times I try, I end up with the same result. I couldn't figure out why it stops partway through.

Could you help me in resolving this issue?

The exact error is shown in a screenshot. It says that val_iou didn't improve from 0.

Also, how can I use a GPU to run the Python script? Currently it has multi-threading options that use the CPU, but my system has a supported GPU. How do I switch?

Thanks in advance lars
Srinath

Question: do not see images with bounding boxes as the output files.

Error message from evaluate_performance.py

runfile('/Users/huinaxu/Desktop/object-localization-master/evaluate_performance.py', wdir='/Users/huinaxu/Desktop/object-localization-master')
Reloaded modules: generate_dataset, train_model
IoU on training data
[progress counter output, printed as n/2949 and collapsed onto one line, omitted]
Avg IoU: 0.5732865147951192
Highest IoU: 0.9722222222222222
Lowest IoU: 0.0

IoU on validation data
737/737103/737
Avg IoU: 0.5246796910795323
Highest IoU: 0.9227614490772386
Lowest IoU: 0.0

Trying out unscaled image
images/Egyptian_Mau_167.jpg
Traceback (most recent call last):

File "", line 1, in
runfile('/Users/---/Desktop/object-localization-master/evaluate_performance.py', wdir='/Users/---/Desktop/object-localization-master')

File "/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
execfile(filename, namespace)

File "/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "/Users/---/Desktop/object-localization-master/evaluate_performance.py", line 119, in
main()

File "/Users/h---/Desktop/object-localization-master/evaluate_performance.py", line 108, in main
pred = predict_image(path, model)

File "/Users/---/Desktop/object-localization-master/evaluate_performance.py", line 42, in predict_image
if im.shape[0] != IMAGE_SIZE:

AttributeError: 'NoneType' object has no attribute 'shape'

Issue found from all the commits. Might be my tensorflow version issue?

HHH-MBP:object-localization-master1 huinaxu$ python3 example_1/train_model.py
/Users/hhh/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Traceback (most recent call last):
File "example_1/train_model.py", line 130, in
main()
File "example_1/train_model.py", line 125, in main
model = create_model(IMAGE_SIZE, ALPHA)
File "example_1/train_model.py", line 75, in create_model
model = MobileNetV2(input_shape=(size, size, 3), include_top=False, alpha=alpha)
File "/Users/hhh/anaconda3/lib/python3.6/site-packages/keras_applications/mobilenet_v2.py", line 355, in MobileNetV2
expansion=1, block_id=0)
File "/Users/hhh/anaconda3/lib/python3.6/site-packages/keras_applications/mobilenet_v2.py", line 461, in _inverted_res_block
in_channels = inputs._keras_shape[-1]
AttributeError: 'Tensor' object has no attribute '_keras_shape'
