
ssd.pytorch's Introduction

SSD: Single Shot MultiBox Object Detector, in PyTorch

A PyTorch implementation of Single Shot MultiBox Detector from the 2016 paper by Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. The official and original Caffe code can be found here.


Installation

  • Install PyTorch by selecting your environment on the website and running the appropriate command.
  • Clone this repository.
    • Note: We currently only support Python 3+.
  • Then download the dataset by following the instructions below.
  • We now support Visdom for real-time loss visualization during training!
    • To use Visdom in the browser:
    # First install Python server and client
    pip install visdom
    # Start the server (probably in a screen or tmux)
    python -m visdom.server
    • Then (during training) navigate to http://localhost:8097/ (see the Train section below for training details). A minimal plotting sketch follows this list.
  • Note: For training, we currently support VOC and COCO, and aim to add ImageNet support soon.
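
To illustrate what the live loss plot involves, here is a minimal Visdom sketch. It is not the repo's plotting code (train.py has its own Visdom flags); the window options below are illustrative only.

import numpy as np
import visdom

vis = visdom.Visdom()  # connects to http://localhost:8097 by default
win = vis.line(X=np.array([0]), Y=np.array([0.0]),
               opts=dict(title='Training loss', xlabel='iteration', ylabel='loss'))

for it in range(1, 101):
    loss = 1.0 / it  # stand-in for the real loss value
    vis.line(X=np.array([it]), Y=np.array([loss]), win=win, update='append')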

Datasets

To make things easy, we provide bash scripts to handle the dataset downloads and setup for you. We also provide simple dataset loaders that subclass torch.utils.data.Dataset, making them fully compatible with the torchvision.datasets API.
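
As a quick illustration of the loader API, here is a minimal sketch; the exact constructor signatures live in data/ (e.g. data/voc0712.py), so treat the arguments below as assumptions to check against the source.

import torch.utils.data as data
from data import VOCDetection, detection_collate  # provided by this repo

dataset = VOCDetection(root='~/data/VOCdevkit')   # transforms omitted for brevity
loader = data.DataLoader(dataset, batch_size=16, shuffle=True,
                         collate_fn=detection_collate)  # images have varying box counts
images, targets = next(iter(loader))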

COCO

Microsoft COCO: Common Objects in Context

Download COCO 2014
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/COCO2014.sh

VOC Dataset

PASCAL VOC: Visual Object Classes

Download VOC2007 trainval & test
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>
Download VOC2012 trainval
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>

Training SSD

mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
  • To train SSD using the train script, simply specify the parameters listed in train.py as flags or change them manually.
python train.py
  • Note:
    • For training, an NVIDIA GPU is strongly recommended for speed.
    • For instructions on Visdom usage/installation, see the Installation section.
    • You can resume training from a checkpoint by specifying its path as one of the training parameters (again, see train.py for options)

Evaluation

To evaluate a trained network:

python eval.py

You can specify the parameters listed in eval.py by passing them as flags or changing them manually.

Performance

VOC2007 Test

mAP

  Original                     77.2 %
  Converted weiliu89 weights   77.26 %
  From scratch w/o data aug    58.12 %
  From scratch w/ data aug     77.43 %
FPS

GTX 1060: ~45.45 FPS

Demos

Use a pre-trained SSD network for detection

Download a pre-trained network

SSD results on multiple datasets

Try the demo notebook

  • Make sure you have jupyter notebook installed.
  • Two alternatives for installing jupyter notebook:
    1. If you installed PyTorch with conda (recommended), then you should already have it. (Just navigate to the ssd.pytorch cloned repo and run): jupyter notebook

    2. If using pip:

# make sure pip is upgraded
pip3 install --upgrade pip
# install jupyter notebook
pip install jupyter
# Run this inside ssd.pytorch
jupyter notebook

Try the webcam demo

  • Works on CPU (you may have to tweak cv2.waitKey for optimal FPS) or on an NVIDIA GPU
  • This demo currently requires OpenCV 2+ with Python bindings and an onboard webcam
    • You can change the default webcam in demo/live.py
  • Install the imutils package to leverage multi-threading on CPU:
    • pip install imutils
  • Running python -m demo.live opens the webcam and begins detecting!

TODO

We have accumulated the following to-do list, which we hope to complete in the near future:

  • Still to come:
    • Support for the MS COCO dataset
    • Support for SSD512 training and testing
    • Support for training on custom datasets

Authors

Note: Unfortunately, this is just a hobby of ours and not a full-time job, so we'll do our best to keep things up to date, but no guarantees. That being said, thanks to everyone for your continued help and feedback as it is really appreciated. We will try to address everything as soon as possible.

References

  • Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. "SSD: Single Shot MultiBox Detector." ECCV 2016.


ssd.pytorch's Issues

How to derive the 'steps' in config.py?

The 'steps' entry in config.py is [8, 16, 32, 64, 100, 300]. I am just wondering how these numbers are derived? I have read the paper, which says 'f_k is the size of the k-th square feature map', but I cannot relate that to the numbers you got. Thanks.
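
One plausible derivation (my reading, not an author statement): each step is roughly image_size / f_k for the six source feature maps, rounded to a convenient value (a power of two for the early layers):

image_size = 300
feature_maps = [38, 19, 10, 5, 3, 1]           # SSD300 source layer sizes
print([image_size / f for f in feature_maps])
# [7.89..., 15.78..., 30.0, 60.0, 100.0, 300.0]  ->  rounded to [8, 16, 32, 64, 100, 300]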

How to visualize the computational graph?

I saw the images that represent the graph, but they were blurred. Is there a way or script to reproduce the graph during training and inference? The .gitignore file suggests visualize.py was used to generate those pictures. I need this because I think it helps a lot as a first step toward a good understanding of the architecture and functionality of the model.

What criterion for the scale factors (prior boxes) did you use in config.py?

Hello

I have a question about the criteria that you used in the config.py.

Since the original paper states that the scale factors should be 'regularly spaced', it seems your definition of the scale factors is quite different.

For example, let's say 4 feature maps are used for prediction, and we define Smin and Smax to be 0.2 and 0.8 respectively; this results in (0.2, 0.4, 0.6, 0.8) as the scale factors for the feature maps.

However, I found that your scale factors
(30 (0.1), 60 (0.2), 111 (0.37), 162 (0.54), 213 (0.71), 264 (0.88))
do not seem to be regularly spaced: the differences between them are (0.1, 0.17, 0.17, 0.17, 0.17). Do you have any special reason to use these values (e.g., improving the accuracy)?

Any comments will be appreciated.
Thanks in advance.
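
For what it's worth, the spacing appears to follow the original Caffe scripts rather than the exact paper formula; here is a sketch of that computation (my reading, not confirmed by the authors):

from math import floor

min_ratio, max_ratio, num_layers = 20, 90, 6
step = int(floor((max_ratio - min_ratio) / (num_layers - 2)))   # = 17
ratios = [min_ratio + step * k for k in range(num_layers - 1)]
print(ratios)                            # [20, 37, 54, 71, 88]; conv4_3 is special-cased at 10
print([r * 300 // 100 for r in ratios])  # [60, 111, 162, 213, 264] pixels, matching config.py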

Running eval.py gives 71.x mAP, compared with your reported 77.x

Hi,

I just finished the training process and ran the test process with eval.py; however, I got a much lower result (compared with the results reported in the README, see here). After further digging, I have some concerns:

  • the parameters, confidence_threshold and top_k, are not used at all. See here.
  • there is no NMS process? See here.

One more thing: you report that without pre-training, using data augmentation alone, you reach 77.43%. What setting do you use? I Xavier-initialized all layers without a pretrained model, keeping all other parameters unchanged as in your repo, and training completely fails (at first the loss is ~15; by 20k iterations it goes down to 7.x and stays there for the remaining iterations (max_iter=120k); the test mAP is 0.4, versus the 71.x I got with pre-training).

Thanks so much for your help!
Hongyang, Francis

KeyError: 'unexpected key "0.weight" in state_dict'

When running python test.py, the output is the following:

Traceback (most recent call last):
  File "test.py", line 73, in <module>
    net.load_state_dict(torch.load(args.trained_model))
  File "/home/qz/lzjqsdd/APP/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 331, in load_state_dict
    .format(name))

Do you have any idea why this could be happening? Thanks

Time profiling result is inconsistent with result from original caffe SSD

I am wondering about the time consumption of each part (VGG, extras, multibox, detection).
In the Caffe version, the VGG part accounts for up to 80 percent of the time.
However, in this version, the distribution of time consumption is as follows:

Total time: 0.018 seconds per image
  VGG part:      8.4%
  Extra layers:  2.8%
  Multibox:     61.0%
  Detect:       27.5%

Most of the time goes to multibox and detect. I measured this with Python's time.time().

The total time for one image is almost the same in both versions:
  Caffe:   19 ms
  PyTorch: 18 ms

I am wondering why this inconsistency happens?
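
One likely source of the inconsistency (an assumption, not a verified diagnosis): CUDA kernels launch asynchronously, so wrapping a stage in time.time() can charge earlier GPU work to whichever later stage first synchronizes with the CPU (detection does host-side work such as NMS). A fairer per-stage timing sketch:

import time
import torch

def timed(fn, *args):
    torch.cuda.synchronize()   # flush kernels queued by earlier stages
    start = time.time()
    out = fn(*args)
    torch.cuda.synchronize()   # wait for this stage's own kernels to finish
    return out, time.time() - start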

A question about the L2Norm.py code

Hello, I don't understand why you compute out = weight * x before returning, rather than returning x directly. Could you tell me the reason?
thx~~ :)

def forward(self, x):
    norm = x.pow(2).sum(1).sqrt() + self.eps
    x /= norm.expand_as(x)
    out = self.weight.unsqueeze(0).unsqueeze(2).unsqueeze(3).expand_as(x) * x  # <= here
    return out
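
For context (my understanding, not an official answer): the weight is the learnable per-channel scale (gamma) from ParseNet. The conv4_3 activations have a different magnitude than the later layers, so they are L2-normalized and then rescaled by a learned gamma (initialized to 20 in this repo) instead of being returned bare. A sketch of an equivalent forward pass:

import torch
import torch.nn.functional as F

x = torch.randn(1, 512, 38, 38)       # conv4_3 feature map
gamma = torch.full((512,), 20.0)      # an nn.Parameter in the real module
out = gamma.view(1, -1, 1, 1) * F.normalize(x, p=2, dim=1)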

CUDNN_STATUS_ALLOC_FAILED

I got this issue coming up. I was able to fix it though by setting cudnn.benchmark = False and setting --batch_size to 8.

[question] How did you calculate the mAP?

Hello @amdegroot,
thank you for making your code available. I am currently working on the ssd_keras port. However, we are missing the mAP score, and I saw that you have already calculated yours. Could you pinpoint which code you used to evaluate your SSD port?

Also, I saw that you are missing the data augmentation part. Maybe you could take a look here: it is a Python generator; it is currently missing the crop transformation, but it has helped me reach a better loss.

Thank you!

How to derive the math in box_utils.py?

def point_form(boxes):
    """ Convert prior_boxes to (xmin, ymin, xmax, ymax)
    representation for comparison to point form ground truth data.
    Args:
        boxes: (tensor) center-size default boxes from priorbox layers.
    Return:
        boxes: (tensor) Converted xmin, ymin, xmax, ymax form of boxes.
    """
    return torch.cat((boxes[:, :2] - boxes[:, 2:]/2,     # xmin, ymin
                     boxes[:, :2] + boxes[:, 2:]/2), 1)  # xmax, ymax


def center_size(boxes):
    """ Convert prior_boxes to (cx, cy, w, h)
    representation for comparison to center-size form ground truth data.
    Args:
        boxes: (tensor) point_form boxes
    Return:
        boxes: (tensor) Converted xmin, ymin, xmax, ymax form of boxes.
    """
    return torch.cat((boxes[:, 2:] + boxes[:, :2])/2,  # cx, cy
                     boxes[:, 2:] - boxes[:, :2], 1)  # w, h

And the corresponding loop in prior_box.py:

            for i, k in enumerate(self.feature_maps):
                step_x = step_y = self.image_size/k
                for h, w in product(range(k), repeat=2):
                    c_x = ((w+0.5) * step_x)
                    c_y = ((h+0.5) * step_y)
                    c_w = c_h = self.min_sizes[i] / 2
                    s_k = self.image_size  # 300
                    # aspect_ratio: 1,
                    # size: min_size
                    mean += [(c_x-c_w)/s_k, (c_y-c_h)/s_k,
                             (c_x+c_w)/s_k, (c_y+c_h)/s_k]
                    if self.max_sizes[i] > 0:
                        # aspect_ratio: 1
                        # size: sqrt(min_size * max_size)/2
                        c_w = c_h = sqrt(self.min_sizes[i] *
                                         self.max_sizes[i])/2
                        mean += [(c_x-c_w)/s_k, (c_y-c_h)/s_k,
                                 (c_x+c_w)/s_k, (c_y+c_h)/s_k]
                    # rest of prior boxes
                    for ar in self.aspect_ratios[i]:
                        if not (abs(ar-1) < 1e-6):
                            c_w = self.min_sizes[i] * sqrt(ar)/2
                            c_h = self.min_sizes[i] / sqrt(ar)/2
                            mean += [(c_x-c_w)/s_k, (c_y-c_h)/s_k,
                                     (c_x+c_w)/s_k, (c_y+c_h)/s_k]
  1. When I cross-reference with prior_box.py, it seems the math does not produce what is written in the comments. center_size seems right, but I think you also need to divide by 2 in the second expression for w and h?
  2. I cannot derive the math for point_form.

Can you kindly verify? Thanks.

Possible bug?

_, loss_idx = loss_c.sort(1, descending=True)
_, idx_rank = loss_idx.sort(1)

Just wondering, is this a bug? I don't think you should put descending=True to find idx_rank.
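
For reference, this double sort is a standard rank-computation trick rather than a bug (my reading): the first, descending sort orders the losses; the second sort recovers each element's rank in that ordering, which hard-negative mining then compares against the allowed number of negatives. A small demonstration:

import torch

loss_c = torch.tensor([[0.1, 0.9, 0.5, 0.3]])
_, loss_idx = loss_c.sort(1, descending=True)   # [[1, 2, 3, 0]]
_, idx_rank = loss_idx.sort(1)                  # [[3, 0, 1, 2]]
# idx_rank[0, j] is element j's rank by loss:
# 0.9 ranks 0th, 0.5 ranks 1st, 0.3 ranks 2nd, 0.1 ranks 3rd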

SSD keep_difficult=False Problem?

Hello

It seems the code trains the network (SSD) without the difficult training examples.

Additionally, I trained the network with the 07++12 training set (07 trainval, 07 test, 12 trainval) and tested it on the 12 test set using the official server. The result was 74.1%, about 2% below the latest version of SSD300 (75.8%). Of course some difference is expected between the libraries (PyTorch vs. Caffe), but it seems a network trained only on the easy examples cannot achieve the original performance.

Any comments will be appreciated.
Thanks in advance.

Error when run test.py

Hi, there is a problem I hope you can help me with, thank you.

File "/home/hd/ssd.pytorch/data/voc.py", line 222
gts.append([label, *(int(bb.text) - 1 for bb in bbox)])
^
SyntaxError: invalid syntax

This error occur when I run the test.py. Thank you again.
^ under the *

RGB vs BGR?

Hello,
I was looking at your implementation, and I believe the input to your model is an image with RGB channel ordering. I was also looking at the Keras implementation, and they use BGR values. I have also been testing with an mAP evaluation script, and it seems I get better results, using the weights you converted from the original Caffe implementation, when I use BGR instead of RGB. Do you happen to know which order we should follow when using the original Caffe weights?

Thank you very much :)

The difference in compute area_a and area_b in jaccard function

I am working on adding RandomHorizontalFlip to this repo, but I always get a NaN loss from SmoothL1Loss. When I read the jaccard function in detail, I found:

area_a = ((box_a[:, 2]-box_a[:, 0]) *
          (box_a[:, 3]-box_a[:, 1])).unsqueeze(1).expand_as(inter)  # [A,B]
area_b = ((box_b[:, 2]-box_b[:, 0]) *
          (box_b[:, 3]-box_b[:, 1])).unsqueeze(0).expand_as(inter) 

Why is the unsqueeze dimension different? I don't understand.
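
For what it's worth (my reading, not an official answer): the two different unsqueeze dimensions broadcast the per-box areas into the [A, B] grid of all pairings, so entry (i, j) combines box i of box_a with box j of box_b:

import torch

A, B = 3, 5
area_a = torch.rand(A)                      # [A]
area_b = torch.rand(B)                      # [B]
grid_a = area_a.unsqueeze(1).expand(A, B)   # [A, 1] -> [A, B]: row i repeats area_a[i]
grid_b = area_b.unsqueeze(0).expand(A, B)   # [1, B] -> [A, B]: column j repeats area_b[j]
union = grid_a + grid_b                     # subtract inter to get the IoU denominator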

PriorBox: the box is out of the image

prior_box.py uses the v2 config: every box is described as (x, y, w, h) instead of the v1 (x1, y1, x2, y2).
Using clamp_(max=1, min=0) on this form still lets the 'bottom' boxes extend outside the image. For example, output[-5, :] is (0.8333, 0.8333, 0.5020, 1.000), so its x2 and y2 fall outside the image. I am not sure whether this hurts accuracy; maybe it could be clamped as in v1 (it may not be a problem).

bug: test.py (BaseTransform)

Line 41, x = torch.from_numpy(transform(img)[0]).permute(2, 0, 1), does not change BGR to RGB. It is not equivalent to dataset = VOCDetection(args.voc_root, [('2007', set_type)], BaseTransform(300, dataset_mean), AnnotationTransform()) (which does change BGR to RGB).

So I think it would be better to move line 138 of voc0712.py, img = img[:, :, (2, 1, 0)], into the base_transform function. (The results will not change much if we set vis_threshold=0.6; however, in eval.py, if we use BaseTransform outside the dataset, it changes the mAP.)

Not use RandomHorizontalFlip?

train_transform() is not used in base_transform. So does this project use RandomHorizontalFlip?
Or is this function called somewhere else?

Number of priors wrong on Multi-GPU mode

Hi there~

PriorBox encounters an error in multi-GPU mode. For example, when running on one GPU, the output sizes would be:

  • size(loc_data) = (16, 8732, 4)
  • size(conf_data) = (16, 8732, 21)
  • size(priors) = (8732, 4)

This is correct. But when running on 2 GPUs, the size of priors becomes (17464, 4), and (26196, 4) on 3 GPUs, while the sizes of loc_data and conf_data remain the same as on 1 GPU.

P.S. I found this bug when applying net = torch.nn.DataParallel(net).cuda() in train.py.

Hope to see the solution.

Thanks.

0/1-indexing error in eval.py

I note that when loading the XML file, 0-based indexing is used:

        obj_struct['bbox'] = [int(bbox.find('xmin').text) - 1,
                              int(bbox.find('ymin').text) - 1,
                              int(bbox.find('xmax').text) - 1,
                              int(bbox.find('ymax').text) - 1]

However, the detections use 1-based indexing:

                # the VOCdevkit expects 1-based indices
                for k in range(dets.shape[0]):
                    f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
                            format(index[1], dets[k, -1],
                                   dets[k, 0] + 1, dets[k, 1] + 1,
                                   dets[k, 2] + 1, dets[k, 3] + 1))

If you use consistent indexing, the mAP for the model ssd300_mAP_77.43_v2.pth should be 0.775538.

Can't train

kaan@ALTAR:ssd.pytorch$ python3 train.py
Loading base network...
Initializing weights...
Loading Dataset...
Training SSD on VOC0712
Traceback (most recent call last):
File "train.py", line 231, in
train()
File "train.py", line 170, in train
images, targets = next(batch_iterator)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 201, in next
return self._process_next_batch(batch)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 221, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
AttributeError: Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 40, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 40, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/kaan/ssd.pytorch/data/voc0712.py", line 117, in getitem
im, gt, h, w = self.pull_item(index)
File "/home/kaan/ssd.pytorch/data/voc0712.py", line 129, in pull_item
height, width, channels = img.shape
AttributeError: 'NoneType' object has no attribute 'shape'

Error in training

I got the following error when training

THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMath.cu line=226 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "train_cars.py", line 232, in <module>
    train()
  File "train_cars.py", line 184, in train
    loss_l, loss_c = criterion(out, targets)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mshah/code/ssd.pytorch/layers/modules/multibox_loss.py", line 70, in forward
    match(self.threshold,truths,defaults,self.variance,labels,loc_t,conf_t,idx)
  File "/home/mshah/code/ssd.pytorch/layers/box_utils.py", line 107, in match
    loc = encode(matches, priors, variances)
  File "/home/mshah/code/ssd.pytorch/layers/box_utils.py", line 133, in encode
    return torch.cat([g_cxcy, g_wh], 1)  # [num_priors,4]
RuntimeError: cuda runtime error (59) : device-side assert triggered at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMath.cu:226

Can I know how to fix this?

RunTime Error in Training with default values

python train.py
Loading base network...
Initializing weights...
Loading Dataset...
Training SSD on VOC0712
Traceback (most recent call last):
  File "train.py", line 232, in <module>
    train()
  File "train.py", line 181, in train
    out = net(images)
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 60, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 70, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 67, in parallel_apply
    raise output
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 42, in _worker
    output = module(*input, **kwargs)
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/gpu/utkrsh/code/ssd.pytorch/ssd.py", line 76, in forward
    s = self.L2Norm(x)
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/gpu/utkrsh/code/ssd.pytorch/layers/modules/l2norm.py", line 21, in forward
    x /= norm.expand_as(x)
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/autograd/variable.py", line 725, in expand_as
    return Expand.apply(self, (tensor.size(),))
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 111, in forward
    result = i.expand(*new_size)
RuntimeError: The expanded size of the tensor (512) must match the existing size (8) at non-singleton dimension 1. at /opt/conda/conda-bld/pytorch_1502009910772/work/torch/lib/THC/generic/THCTensor.c:323

I am getting the above stack trace after running train.py for default values. The dataset and weights were downloaded in the default location.
I am using python 3.6 and pytorch 0.2.0
I do understand the meaning of the error; I am just not able to find the source. Can anyone point me in the right direction?

Run time error: tensors are on different GPUs

When I run the demo Jupyter notebook, I get a runtime error at "y = net(xx)". I have one GPU (GPU 0).
Thank you very much

RuntimeError Traceback (most recent call last)
in <module>()
3 xx = xx.cuda()
4 print(xx.t())
----> 5 y = net(xx)

/home/tech/anaconda3/envs/tf35/lib/python3.5/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
204
205 def __call__(self, *input, **kwargs):
--> 206 result = self.forward(*input, **kwargs)
207 for hook in self._forward_hooks.values():
208 hook_result = hook(self, input, result)

/home/tech/ssd.pytorch/ssd.py in forward(self, x)
72 # apply vgg up to conv4_3 relu
73 for k in range(23):
---> 74 x = self.vgg[k](x)
75
76 s = self.L2Norm(x)

/home/tech/anaconda3/envs/tf35/lib/python3.5/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
204
205 def __call__(self, *input, **kwargs):
--> 206 result = self.forward(*input, **kwargs)
207 for hook in self._forward_hooks.values():
208 hook_result = hook(self, input, result)

/home/tech/anaconda3/envs/tf35/lib/python3.5/site-packages/torch/nn/modules/conv.py in forward(self, input)
235 def forward(self, input):
236 return F.conv2d(input, self.weight, self.bias, self.stride,
--> 237 self.padding, self.dilation, self.groups)
238
239

/home/tech/anaconda3/envs/tf35/lib/python3.5/site-packages/torch/nn/functional.py in conv2d(input, weight, bias, stride, padding, dilation, groups)
38 f = ConvNd(_pair(stride), _pair(padding), _pair(dilation), False,
39 _pair(0), groups, torch.backends.cudnn.benchmark, torch.backends.cudnn.enabled)
---> 40 return f(input, weight, bias)
41
42

RuntimeError: tensors are on different GPUs

Variance not used in priorbox?

I notice that you did not use the variance in PriorBox. Is it supposed to be like this? The Caffe code has the following, which you seem to have left out:

top_data += top[0]->offset(0, 1);
  if (variance_.size() == 1) {
    caffe_set<Dtype>(dim, Dtype(variance_[0]), top_data);
  } else {
    int count = 0;
    for (int h = 0; h < layer_height; ++h) {
      for (int w = 0; w < layer_width; ++w) {
        for (int i = 0; i < num_priors_; ++i) {
          for (int j = 0; j < 4; ++j) {
            top_data[count] = variance_[j];
            ++count;
          }
        }
      }
    }
  }
}

Though I do not understand what the offset and caffe_set does. Do you have any idea?
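
As far as I can tell (an assumption worth verifying against the source): this port does not bake the variances into the prior-box tensor the way Caffe's second channel does; instead they are applied when box offsets are encoded and decoded in layers/box_utils.py. A sketch of the encode math under that reading, with matched boxes already in (cx, cy, w, h) form:

import torch

def encode(matched, priors, variances=(0.1, 0.2)):
    # matched, priors: [num_priors, 4] tensors in (cx, cy, w, h) form
    g_cxcy = (matched[:, :2] - priors[:, :2]) / (variances[0] * priors[:, 2:])
    g_wh = torch.log(matched[:, 2:] / priors[:, 2:]) / variances[1]
    return torch.cat([g_cxcy, g_wh], 1)   # [num_priors, 4]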

Training loss goes to NaN with the default parameters

Loading base network...
Initializing weights...
Loading Dataset...
Training SSD on VOC2007
Timer: 6.7833 sec.
iter 0 || Loss: 26.1034 || Timer: 0.2098 sec.
iter 10 || Loss: 15.1629 || Timer: 0.2115 sec.
iter 20 || Loss: 15.4713 || Timer: 0.2101 sec.
iter 30 || Loss: 17.6274 || Timer: 0.2153 sec.
iter 40 || Loss: 31.7296 || Timer: 0.2107 sec.
iter 50 || Loss: nan || Timer: 0.2113 sec.
iter 60 || Loss: nan || Timer: 0.2073 sec.
iter 70 || Loss: nan || Timer: 0.2035 sec.
iter 80 || Loss: nan || Timer: 0.2090 sec.
iter 90 || Loss: nan || Timer: 0.2055 sec.
iter 100 || Loss: nan || Timer: 0.2196 sec.
iter 110 || Loss: nan || Timer: 0.2064 sec.
iter 120 || Loss: nan || Timer: 0.2257 sec.
iter 130 || Loss: nan || Timer: 0.2051 sec.
iter 140 || Loss: nan || Timer: 0.2142 sec.
iter 150 || Loss: nan || Timer: 0.2056 sec.
iter 160 || Loss: nan || Timer: 0.2122 sec.
iter 170 || Loss: nan || Timer: 0.2090 sec.
iter 180 || Loss: nan || Timer: 0.2091 sec.
iter 190 || Loss: nan || Timer: 0.2110 sec.

Error while modified to train my own dataset

Hi, I recently modified your code to train on my own dataset.
Basically I made the following changes:
  1. Changed the classes and num_classes
  2. Changed the dataset path
  3. Changed the RGB mean values of the dataset

Then I ran the modified train.py and encountered an error:
CUDA_LAUNCH_BLOCKING=1 python train_button.py
Loading base network...
Initializing weights...
Loading Dataset...
Training SSD on button
/b/wheel/pytorch-src/torch/lib/THCUNN/ClassNLLCriterion.cu:52: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [9,0,0] Assertion t >= 0 && t < n_classes failed.
(the same assertion repeats for threads [10,0,0] through [25,0,0])
THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THCUNN/generic/ClassNLLCriterion.cu line=83 error=59 : device-side assert triggered
Traceback (most recent call last):
File "train_button.py", line 204, in
train()
File "train_button.py", line 160, in train
loss_l, loss_c = criterion(out, targets)
File "/home/deep-server/.pyenv/versions/anaconda3-4.2.0/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "/home/deep-server/Documents/Jingya/ssd.pytorch/modules/multibox_loss.py", line 110, in forward
loss_c = F.cross_entropy(conf_p, targets_weighted, size_average=False)
File "/home/deep-server/.pyenv/versions/anaconda3-4.2.0/lib/python3.5/site-packages/torch/nn/functional.py", line 509, in cross_entropy
return nll_loss(log_softmax(input), target, weight, size_average)
File "/home/deep-server/.pyenv/versions/anaconda3-4.2.0/lib/python3.5/site-packages/torch/nn/functional.py", line 477, in nll_loss
return f(input, target)
File "/home/deep-server/.pyenv/versions/anaconda3-4.2.0/lib/python3.5/site-packages/torch/nn/_functions/thnn/auto.py", line 41, in forward
output, *self.additional_args)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /b/wheel/pytorch-src/torch/lib/THCUNN/generic/ClassNLLCriterion.cu:83

Can you please help me with the possible reason for the error?
It seems related to the line loss_c = F.cross_entropy(conf_p, targets_weighted, size_average=False),
but I don't understand how it could go wrong.
Thank you in advance.
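
For reference, that assertion fires when a target label falls outside [0, num_classes). SSD reserves label 0 for background, so num_classes must be your own class count plus one, and object labels must start at 1. A quick sanity check, assuming targets are this repo's per-image [num_objs, 5] tensors with the label in the last column:

num_classes = 3                       # e.g. 2 object classes + background
for images, targets in data_loader:   # data_loader: your training DataLoader
    for t in targets:
        labels = t[:, -1]
        assert ((labels >= 0) & (labels < num_classes)).all(), labels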

Error: dimension out of range

When I run test.py, the line y = net(x) raises an error:
RuntimeError: dimension out of range - got 1 but the tensor is only 1D

The change I made:

'--cuda', default=True

Thank you for your help.

Running on GPU errors

I'm very new to PyTorch. I'm getting these errors when I run the test.py file:

File "test.py", line 93, in
thresh=args.visual_threshold)
File "test.py", line 54, in test_net
y = net(x) # forward pass
File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "/workspace/ssd.pytorch/ssd.py", line 102, in forward
self.priors # default boxes
File "/workspace/ssd.pytorch/layers/functions/detection.py", line 51, in forward
decoded_boxes = decode(loc_data[i], prior_data, self.variance)
File "/workspace/ssd.pytorch/layers/box_utils.py", line 152, in decode
priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],
File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/tensor.py", line 283, in mul
return self.mul(other)
TypeError: mul received an invalid combination of arguments - got (torch.FloatTensor), but expected one of:

  • (float value)
    didn't match because some of the arguments have invalid types: (torch.FloatTensor)
  • (torch.cuda.FloatTensor other)
    didn't match because some of the arguments have invalid types: (torch.FloatTensor)

RuntimeError when converting image to tensor

Hi @amdegroot , I was trying to get the demo running and I'm having a problem when calling transform(img) of the BaseTransform class.

When doing python test.py the output is the following.

Finished loading model!
Testing image 1/4952....
Traceback (most recent call last):
  File "test.py", line 84, in <module>
    thresh=args.visual_threshold)
  File "test.py", line 39, in test_net
    x = Variable(transform(img).unsqueeze(0))
  File "/home/arian/Documents/proyecto-integrador/models/ssd/ssd-pytorch/data/data_augment.py", line 119, in __call__
    return torch.Tensor(img)
RuntimeError: tried to construct a tensor from a nested float sequence, but found an item of type numpy.float32 at index (0, 0, 0)

This happens in the demo notebook and in the test.py file.
Do you have any idea why this could be happening?

Thanks,
Arian.

Why did you set difficult to False?

Hello

I am wondering why you set the difficult training examples to False (keep_difficult=False),
since I found that the original code uses the difficult training set as well.

Thanks

cuda Runtime Error (77): an illegal memory access was encountered

iter 510 || Loss: 6.8001 || THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1502009910772/work/torch/lib/THC/generated/../THCReduceAll.cuh line=334 error=77 : an illegal memory access was encountered
Traceback (most recent call last):
  File "train.py", line 231, in <module>
    train()
  File "train.py", line 183, in train
    loss_l, loss_c = criterion(out, targets)
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/users/gpu/utkrsh/code/ssd.pytorch/layers/modules/multibox_loss.py", line 137, in forward
    conf_p = conf_data[(pos_idx+neg_idx).gt(0)].view(-1, self.num_classes)
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/autograd/variable.py", line 72, in __getitem__
    return MaskedSelect.apply(self, key)
  File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 468, in forward
    return tensor.masked_select(mask)
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /opt/conda/conda-bld/pytorch_1502009910772/work/torch/lib/THC/generated/../THCReduceAll.cuh:334
I am trying to train the network with a slight modification in localization loss in multibox_loss.py. I keep on getting this error message for the same line of code. Also, when starting to train, there is a warning
/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py:450: UserWarning: mask is not broadcastable to self, but they have the same number of elements. Falling back to deprecated pointwise behavior. return tensor.masked_fill_(mask, value)

I am training with batch_size=32 in train.py and everything else is at the default value. I have tried to modify the code but there is no impact on the warning and I keep getting this error.
Also, if I use a larger batch_size in train.py like 40, I get this illegal memory access error much earlier than with size 32.
Any suggestions for what might be wrong?

Why "best_truth_overlap.index_fill_(0, best_prior_idx, 2)?"

Can I know how the following line ensures the best prior? Why 2? What happens if this line is not included? It seems to me that this line is not necessary. Thanks.

best_truth_overlap.index_fill_(0, best_prior_idx, 2)  # ensure best prior
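
My reading of this line (not an official answer): index_fill_ writes the value 2, which is larger than any IoU (IoU is at most 1), into best_truth_overlap at each ground-truth box's best prior. Those priors then survive the later overlap-threshold filter even when their raw IoU is below the threshold, so every ground-truth box keeps at least one matched prior; without it, a poorly covered box could get no positive match at all. A tiny demonstration:

import torch

best_truth_overlap = torch.tensor([0.10, 0.45, 0.30, 0.70])
best_prior_idx = torch.tensor([2])        # prior 2 is some GT box's best match
best_truth_overlap.index_fill_(0, best_prior_idx, 2)
print(best_truth_overlap)                 # tensor([0.10, 0.45, 2.00, 0.70])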

NaN values at Multibox encoding

I've tried to implement a detector for my own dataset; however, at training time the localization loss is NaN due to negative values in g_wh (layers/box_utils.py#L137). I don't know whether this error is related to the format of the bounding boxes or to the output of the SSD model.

I would like to know whether I am doing something wrong while loading the dataset or whether the error is caused by a bug in the base implementation.
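
One common cause (an assumption, not a confirmed diagnosis): the encoding takes a log of each matched box's width and height, so a degenerate ground-truth box with xmax <= xmin or ymax <= ymin (often introduced by annotation errors or augmentation) produces a NaN. A quick validation sketch for annotations in (xmin, ymin, xmax, ymax) form:

import torch

def check_boxes(boxes):
    # boxes: [N, 4] tensor of (xmin, ymin, xmax, ymax), normalized to [0, 1]
    wh = boxes[:, 2:] - boxes[:, :2]
    bad = (wh <= 0).any(1)
    if bad.any():
        print('degenerate boxes:', boxes[bad])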

how to obtain weiliu89 weights

I was wondering how I could obtain the weiliu89 weights to validate the 77.2%. I have changed some parts of the code and just want to verify that it is still reproducible. Thanks.

Demo and Evaluation Detection

Hello, I can finally run your code, but I don't know why the mAP is 0.0 for everything. When I use demo.ipynb, the detection completes, but I can't plot the bounding boxes.

How to fix this?
-Thank you-

tensors are on different GPUs

I ran the demo and it returned 'tensors are on different GPUs', but I have only one GPU.
The demo runs successfully on CPU.
Could you describe the process for running it on the GPU?
Thank you very much!

runtime error

hi,

Have you successfully run train.py?
I encountered a runtime error saying "div_ only supports scalar multiplication" from the line x /= norm.expand_as(x) in modules/l2norm.py.
I then changed this line to x = x.div(norm.expand_as(x)) but got another CUDA runtime error, "device-side assert triggered", from the line return torch.cat([g_cxcy, g_wh], 1) in box_utils.py.

BTW, I am using Python 2.7 instead of Python 3.
