omni-us / squeezedet-keras
Keras implementation of the Squeeze Det Object Detection Deep Learning Framework
License: MIT License
Hello!
First of all, I have to say that this is a great repository! I would just like to point out that some of the TODOs in this file could largely be copy-pasted from this repository to enable dynamic data augmentation on the GPU.
It is also reported here that doing so, together with anchor matching on the GPU, makes training 1.8 times faster.
squeezedet-keras/main/model/evaluation.py
Line 493 in 129eaa3
Three quote characters are missing on the line above, so the recall score does not return a value and an error is raised.
Hi,
I am wondering how long it took you to train the model. How many epochs did you have to run it for? I'm trying to run it on the COCO dataset. After 150 epochs, the loss is still decreasing, but the eval.py script shows the model getting worse. I also see the model generating too many bounding boxes. Any help would be appreciated.
Thanks.
I am a newbie in the object detection area.
I have gone through your code and noticed that cfg.ANCHOR_SEED is declared with some pre-defined values. How do we calculate these for a new dataset? Should we use the same values for object detection on any images?
I have gone through another repository, which does object detection using YOLO (https://github.com/experiencor/keras-yolo3/blob/master/gen_anchors.py). It seems that there the anchors are generated separately for each dataset.
So, can you please guide me here?
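The linked gen_anchors.py derives anchors by clustering the ground-truth box sizes of the dataset with k-means. A minimal sketch of that idea follows, assuming the boxes are available as (width, height) tuples measured at the network input resolution. The function names and the sample boxes are hypothetical, and whether the values in cfg.ANCHOR_SEED were originally produced this way is not stated in this thread.

```python
import random

def iou_wh(box, centroid):
    """IoU of two boxes given only as (width, height), aligned at one corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (width, height) pairs into k anchor shapes, using IoU as similarity."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the centroid it overlaps most.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if not members:
                new_centroids.append(centroids[i])  # leave empty clusters in place
                continue
            w = sum(m[0] for m in members) / len(members)
            h = sum(m[1] for m in members) / len(members)
            new_centroids.append((w, h))
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids)

# Hypothetical box sizes in pixels; replace with the sizes from your own labels.
boxes = [(30, 32), (34, 36), (160, 80), (170, 90), (360, 175), (380, 180)]
anchors = kmeans_anchors(boxes, k=3)
```

The resulting (width, height) pairs would then be candidates for the anchor seed, with k chosen to match the number of anchors per grid cell.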
I am trying to run the following training part:
python ../../scripts/train.py --init ../../main/model/imagenet.h5
But it finds neither the 'main' module nor 'tensorflow', and fails with:
ModuleNotFoundError: no module named main
ModuleNotFoundError: no module named tensorflow
Can you please help?
Thank you!
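A "no module named main" error usually means the repository root is not on Python's import path, while "no module named tensorflow" means TensorFlow is not installed in the interpreter that runs the script. A minimal sketch for the first case follows, assuming train.py sits in a scripts folder directly under the repository root. The helper name and the example path are hypothetical; setting the PYTHONPATH environment variable to the repository root is an alternative.

```python
import os
import sys

def add_repo_root(script_path, levels_up=1):
    """Put the directory `levels_up` levels above the script's folder on sys.path."""
    root = os.path.dirname(os.path.abspath(script_path))
    for _ in range(levels_up):
        root = os.path.dirname(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root

# Hypothetical layout: <repo>/scripts/train.py importing from <repo>/main/...
repo_root = add_repo_root(os.path.join("squeezedet-keras", "scripts", "train.py"))
```

Inside a script, the same call would typically be made with `__file__` before the first `import main...` statement.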
Hi,
While trying to train my model, I found the following issue in the DataGenerator.py file.
In read_image_and_gt(), the original width and height are stored after the image has been resized, so the scaling coefficients for X and Y will always be 1. If you train on images of different sizes this could be a major problem, because the bounding boxes won't be scaled accordingly.
def read_image_and_gt(img_files, gt_files, config):
    .............
    for img_name, gt_name in zip(img_files, gt_files):
        # open img
        img = cv2.imread(img_name).astype(np.float32, copy=False)
        # scale image
        img = cv2.resize(img, (config.IMAGE_WIDTH, config.IMAGE_HEIGHT))
        # subtract means
        img = (img - np.mean(img)) / np.std(img)
        # store original height and width?
        orig_h, orig_w, _ = [float(v) for v in img.shape]
        .............................................................
        # scale annotation
        x_scale = config.IMAGE_WIDTH / orig_w
        y_scale = config.IMAGE_HEIGHT / orig_h
        # scale boxes
        bboxes_per_file[:, 0::2] = bboxes_per_file[:, 0::2] * x_scale
        bboxes_per_file[:, 1::2] = bboxes_per_file[:, 1::2] * y_scale
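One way to address this would be to read the original size before the resize and compute the scale factors from it. The sketch below shows only the box scaling in plain Python, assuming boxes are stored as [xmin, ymin, xmax, ymax], which matches the 0::2 / 1::2 slicing above. The function name scale_boxes and the sample numbers are hypothetical and not part of the repository.

```python
def scale_boxes(boxes, orig_w, orig_h, target_w, target_h):
    """Rescale [xmin, ymin, xmax, ymax] boxes from original to target image size."""
    x_scale = target_w / orig_w
    y_scale = target_h / orig_h
    return [
        [xmin * x_scale, ymin * y_scale, xmax * x_scale, ymax * y_scale]
        for xmin, ymin, xmax, ymax in boxes
    ]

# The original size must be read *before* resizing, e.g. with OpenCV:
#   img = cv2.imread(img_name)
#   orig_h, orig_w = img.shape[:2]
#   img = cv2.resize(img, (config.IMAGE_WIDTH, config.IMAGE_HEIGHT))
scaled = scale_boxes([[100, 50, 300, 150]], orig_w=624, orig_h=192,
                     target_w=1248, target_h=384)
# scaled == [[200.0, 100.0, 600.0, 300.0]]
```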
I am trying to use my trained model to run object detection on images.
By looking at the code (mainly the part used to produce image summaries in TensorBoard) I managed to gather all the functions needed for this task. Nevertheless, early results seem a bit peculiar and I would like to make sure that I didn't make a mistake.
Basically I am re-creating the model, loading the weights, reading an image and finally using model.predict() to get the object detection results.
Image is loaded via:
img_path = 'path/to/an/image'
img = image.load_img(img_path, target_size=(height, width))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
Something like:
y_pred = model.predict(x)
all_filtered_boxes, all_filtered_classes, all_filtered_scores = filter_batch(y_pred, cfg)
This produces a y_pred of shape (16848, 12), so I pass it through filter_batch to obtain usable results, i.e. bboxes, classes and scores.
Is this a valid way to do it? Did I miss something in the process?
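One thing that may be worth checking is preprocessing. The training loader quoted elsewhere on this page (read_image_and_gt) standardizes each image with (img - mean) / std, whereas the loading code above passes raw pixel values to the model. Note also that cv2.imread returns channels in BGR order while Keras load_img returns RGB. A minimal sketch of the standardization step on a flat list of pixel values follows. The function name is hypothetical, and whether this explains the peculiar results is an assumption.

```python
import statistics

def standardize(pixels):
    """Zero-mean, unit-variance scaling of a flat list of pixel values."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)  # population std, like the default of np.std
    return [(p - mean) / std for p in pixels]

normalized = standardize([0.0, 64.0, 128.0, 255.0])
```

With NumPy arrays the equivalent would be `x = (x - np.mean(x)) / np.std(x)`, applied before `model.predict(x)`.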
Hi,
How can I use the pretrained TF model (trained on the KITTI dataset) from the original repo:
https://github.com/BichenWuUCB/squeezeDet ?
It is stored as a binary file there, whereas we need an HDF5 file.
Or perhaps you could provide your fine-tuned model for the KITTI dataset? (Unfortunately I have no GPU here to train it myself.)
There's a typo on line 151 of the train.py file:
#use default is nothing is given
The first "is" should be replaced by "if".
I understand that we ultimately need to give (x, y, w, h) to the network, where x, y is the grid centre and w, h are the width and height of the grid. But I don't understand how the seed is used in create_config.py to calculate the anchor boxes. It is multiplied by (H, W) and reshaped to (H, W, B, 2). What is the 2 here? Is that 2 really needed? I have to change the anchor box dimensions for my dataset. How do I do that?
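As far as can be inferred from the shape (H, W, B, 2), the trailing 2 is presumably the (width, height) pair of each of the B seed shapes, repeated for every one of the H x W grid cells. The sketch below builds the full anchor list in plain Python under that assumption. The function name, the seed values and the way grid centres are spaced are illustrative and not taken from create_config.py.

```python
def build_anchors(seed, grid_h, grid_w, img_h, img_w):
    """Return one (cx, cy, w, h) tuple per grid cell and per seed shape."""
    anchors = []
    for row in range(grid_h):
        cy = (row + 1) * img_h / (grid_h + 1)      # assumed centre spacing
        for col in range(grid_w):
            cx = (col + 1) * img_w / (grid_w + 1)  # assumed centre spacing
            for w, h in seed:                      # the trailing "2": (width, height)
                anchors.append((cx, cy, w, h))
    return anchors

# Hypothetical seed with B = 3 shapes; replace with sizes that suit your dataset.
seed = [(36.0, 37.0), (162.0, 87.0), (366.0, 174.0)]
anchors = build_anchors(seed, grid_h=24, grid_w=78, img_h=384, img_w=1248)
# len(anchors) == 24 * 78 * 3 == 5616
```

Under this reading, changing the anchor dimensions for a new dataset means changing the (width, height) pairs in the seed, while the grid centres follow from the input and grid sizes.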
Hi,
I wonder how hard negative mining could be done here, as in the SSD paper:
"Instead of using all the negative examples, we sort them using the highest confidence loss for each default box and pick the top ones so that the ratio between the negatives and positives is at most 3:1."
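The selection step described in the quote can be sketched in a few lines. This is a plain-Python illustration that assumes a per-anchor confidence loss and a positive/negative flag are already available. The function and variable names are hypothetical, and wiring this into the Keras loss of this repository is not shown.

```python
def hard_negative_indices(conf_loss, is_positive, neg_pos_ratio=3):
    """Indices of the highest-loss negatives, capped at neg_pos_ratio * positives."""
    num_pos = sum(is_positive)
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    negatives.sort(key=lambda i: conf_loss[i], reverse=True)
    return negatives[: neg_pos_ratio * num_pos]

conf_loss   = [0.9, 0.1, 0.8, 0.05, 0.7, 0.6, 0.2]
is_positive = [True, False, False, False, False, False, False]
kept = hard_negative_indices(conf_loss, is_positive)
# kept == [2, 4, 5]: the three hardest negatives for the single positive
```

Only the positives and the kept negatives would then contribute to the confidence loss.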
I've been running this on the KITTI dataset on and off for a while now, but I still cannot produce the results I can produce using the original TF SqueezeDet. I can achieve around 85% for car, 75% for pedestrian using the original, but I can only achieve around 80% for car, 55% for pedestrian using this version after 100 epochs. mAP doesn't get much higher than 60%. Has anyone tried training for 100 epochs and care to share their results? Anyone know why the original version gives better performance?
Hi there,
Using the original setting the loss starts at about 33 but plateaus at about 3.0.
Any recommendations?
I'm curious what's the ultimate loss that you get after 100 epochs.
Thank you.
Hi
I should first say that this is a great project! I was just reading through a section of the source for evaluation and noticed the notion of "MAP", which typically means "Mean Average Precision", is implemented as mean of interpolated average precision:
prmap_feed_dict[mAP_placeholder] = np.mean(AP[:,1], axis=0)
#save model with biggest mean average precision
if np.mean(AP[:,1], axis=0) > best_mAP:
    best_mAP = np.mean(AP[:,1], axis=0)
    best_mAP_ckpt = current_model
https://github.com/omni-us/squeezedet-keras/blob/master/scripts/eval.py#L378-L383
This implementation of MAP returns a larger value than the traditional one, of course. Can I ask why the mean of the interpolated AP is used here?
Thanks
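The difference between the two definitions can be illustrated on a small precision-recall curve. In the sketch below, interpolation replaces each precision with the maximum precision found at the same or any higher recall, so the interpolated value can never be lower than the plain one. The function name and the sample curve are hypothetical and do not reproduce the code in eval.py.

```python
def average_precision(recall, precision, interpolate=False):
    """Area under a precision-recall curve given points sorted by increasing recall."""
    precision = list(precision)
    if interpolate:
        # Replace each precision with the maximum at this or any higher recall.
        for i in range(len(precision) - 2, -1, -1):
            precision[i] = max(precision[i], precision[i + 1])
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recall, precision):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap

recall    = [0.2, 0.4, 0.6, 0.8]
precision = [1.0, 0.5, 0.75, 0.6]
plain  = average_precision(recall, precision)                    # about 0.57
interp = average_precision(recall, precision, interpolate=True)  # about 0.62
```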
Hi
I noticed that there's a metric for "loss without regularisation", but is it actually meaningful? Also, the implementation of loss_without_regularization() appears to be identical to loss(). Am I missing something?
Thanks
Great work, and I want to try this out myself.
However, when I compile the model, I get the errors below.
The shape of preds is (?, 72, 78, 256) and the number of anchors is 16848, so it cannot be reshaped.
I was wondering how people are getting around this, or whether I installed it incorrectly?
File "../../scripts/train.py", line 314, in <module>
    train()
File "../../scripts/train.py", line 131, in train
    squeeze = SqueezeDet(cfg)
File "/home/ken/rev7_ken/keras/objectdetection/squeezedet-keras/main/model/squeezeDet.py", line 32, in __init__
    self.model = self._create_model()
File "/home/ken/rev7_ken/keras/objectdetection/squeezedet-keras/main/model/squeezeDet.py", line 95, in _create_model
    pred_reshaped = Reshape((self.config.ANCHORS, -1))(preds)
File "/home/ken/miniconda3/lib/python3.6/site-packages/keras/engine/base_layer.py", line 474, in __call__
    output_shape = self.compute_output_shape(input_shape)
File "/home/ken/miniconda3/lib/python3.6/site-packages/keras/layers/core.py", line 394, in compute_output_shape
    input_shape[1:], self.target_shape)
File "/home/ken/miniconda3/lib/python3.6/site-packages/keras/layers/core.py", line 379, in _fix_unknown_dimension
    raise ValueError(msg)
ValueError: total size of new array must be unchanged
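Reshape((ANCHORS, -1)) can only succeed when the number of values per sample in preds is an exact multiple of ANCHORS. A small check with the reported numbers shows why it fails here. The helper name is hypothetical, and the channel count used for the "expected" case is only an illustrative assumption.

```python
def reshape_is_possible(feature_shape, num_anchors):
    """True if a (H, W, C) feature map can be reshaped to (num_anchors, -1)."""
    h, w, c = feature_shape
    return (h * w * c) % num_anchors == 0

# Shapes reported above: preds is (?, 72, 78, 256), ANCHORS is 16848.
ok_reported = reshape_is_possible((72, 78, 256), 16848)   # False
# 16848 equals 24 * 78 * 9, i.e. a 24 x 78 grid with 9 anchors per cell.
# The channel count 9 * 8 below is an assumed example, not read from the model.
ok_expected = reshape_is_possible((24, 78, 9 * 8), 16848)  # True
```

Since 16848 corresponds to a grid height of 24 while preds has a height of 72, the input size and the configured anchor grid are probably out of sync, although the traceback alone does not confirm this.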
Hello
I've tried running train.py and noticed there's no disk read activity after the first epoch. As far as I can see, images are read using cv2.imread and I don't think there's any buffer mechanism implemented. Is it normal to have no disk reads after the first epoch? If so, how is buffering achieved?
Thank you
Hello Everybody,
I'm trying to visualize some representations.
`import numpy as np
import random
from tensorflow.keras.preprocessing.image import img_to_array, load_img
from tensorflow.keras import layers,models
from keras.preprocessing.image import ImageDataGenerator
import tensorflow as tf
successive_outputs = [layer.output for layer in squeeze.model.layers[1:]]
visualization_model = tf.keras.Model(inputs = squeeze.model.input, outputs = successive_outputs)`
When visualization_model is created, the following error is raised:
`
visualization_model = tf.keras.Model(inputs = squeeze.model.input, outputs = successive_outputs)
Traceback (most recent call last):
File "", line 1, in <module>
    visualization_model = tf.keras.Model(inputs = squeeze.model.input, outputs = successive_outputs)
File "C:\ICMS\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\training.py", line 121, in __init__
    super(Model, self).__init__(*args, **kwargs)
File "C:\ICMS\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\network.py", line 81, in __init__
    self._init_graph_network(*args, **kwargs)
File "C:\ICMS\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\checkpointable\base.py", line 442, in _method_wrapper
    method(self, *args, **kwargs)
File "C:\ICMS\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\network.py", line 227, in _init_graph_network
    self._track_layers(layers)
File "C:\ICMS\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\network.py", line 327, in _track_layers
    layer, name='layer-%d' % layer_index, overwrite=True)
File "C:\ICMS\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\checkpointable\base.py", line 688, in _track_checkpointable
    "Checkpointable.") % (type(checkpointable),))
TypeError: Checkpointable._track_checkpointable() passed type <class 'keras.engine.input_layer.InputLayer'>, not a Checkpointable.`
Thanks a lot for help.
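The last line of the traceback names keras.engine.input_layer.InputLayer, a class from the standalone keras package, while the wrapper being built is tf.keras.Model. Mixing layers from the two packages is a likely cause of this error. The small helper below reports which top-level package each layer object comes from. Its name is hypothetical, and the stand-in objects are used only so that the snippet runs without Keras installed.

```python
def layer_packages(layers):
    """Return the set of top-level packages that the given layer objects come from."""
    return {type(layer).__module__.split(".")[0] for layer in layers}

# With a real model this would be: layer_packages(squeeze.model.layers)
# A result such as {"keras"} would mean the model was built with standalone Keras.
packages = layer_packages([1, "a"])  # stand-in objects; both live in "builtins"
```

If the layers do come from standalone keras, building the visualization model with keras.models.Model instead of tf.keras.Model, and importing everything from one package only, would be the change to try.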
Hello,
I have trained the network on the KITTI dataset.
Could you please share code for testing the network, i.e. for visualizing test images with their predicted bounding boxes?
Thanks in advance
Hi,
I would like to use this repository to do some detections with my own labelled data and the annotations are in a csv file. Please let me know the exact layout of the annotations file that I need to use.
thanks!
I am trying to deploy a model trained with this repo on custom data to mobile and server side. I am getting strange errors.
@Sileadim was the model in your use case deployed on mobile and/or server side for applications?
Any pointers would be helpful.
Thanks
Aman
Hi,
I am wondering how to do transfer learning for squeezedet using a pretrained model. I also want to reduce the number of classes to 1.
It seems you guys may have merged branches incorrectly at some point. For example, there are duplicate definitions of def read_image_and_gt_with_original(img_files, gt_files, config): in dataGenerator.py
I am training on a custom dataset. In the squeezedet config file I wish to change the network input size to a square one to prevent squashing of the images. However, when I attempt to do so, the training program crashes with the following error:
ValueError: total size of new array must be unchanged [squeezeDet.py line 96]
Do I have to change the anchor shapes with respect to network input width and height, if so how do I go about it? I wish to change from 1248x384 -> 832x832
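The anchor count has to match the size of the output grid, which depends on the input size. For the default 1248x384 input, a 78 x 24 grid with 9 anchors per cell gives 16848 anchors, which suggests a downsampling factor of 16. The sketch below computes the grid and anchor count for a new input size under that assumption. The function name, the stride of 16 and the 9 anchors per cell are inferred from these numbers rather than read from the repository's config.

```python
def anchor_grid(img_w, img_h, anchors_per_cell=9, stride=16):
    """Grid width, grid height and total anchor count for a given input size."""
    if img_w % stride or img_h % stride:
        raise ValueError("input size should be a multiple of the stride")
    grid_w, grid_h = img_w // stride, img_h // stride
    return grid_w, grid_h, grid_w * grid_h * anchors_per_cell

kitti = anchor_grid(1248, 384)   # (78, 24, 16848)
square = anchor_grid(832, 832)   # (52, 52, 24336)
```

For an 832x832 input, the config values that describe the grid width, grid height and total number of anchors would therefore all need to change together, otherwise the Reshape in squeezeDet.py fails as shown.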