pytorch-yolov3's Introduction

PyTorch YOLO

A minimal PyTorch implementation of YOLOv3, with support for training, inference and evaluation.

YOLOv4 and YOLOv7 weights are also compatible with this implementation.

Installation

Installing from source

For normal training and evaluation we recommend installing the package from source using a poetry virtual environment.

git clone https://github.com/eriklindernoren/PyTorch-YOLOv3
cd PyTorch-YOLOv3/
pip3 install poetry --user
poetry install

Before running any of the following commands without the poetry run prefix, enter the virtual environment by running poetry shell in this directory. If you want to use the commands everywhere without opening a Poetry shell, see the pip installation method below.

Download pretrained weights

./weights/download_weights.sh

Download COCO

./data/get_coco_dataset.sh

Install via pip

This installation method is recommended if you want to use this package as a dependency in another Python project. It includes only the code, is less isolated, and may conflict with other packages. Weights and the COCO dataset need to be downloaded as described above. See the API section for further information about the package's API. This method also makes the CLI tools yolo-detect, yolo-train, and yolo-test available everywhere without any additional commands.

pip3 install pytorchyolo --user

Test

Evaluates the model on the COCO test dataset. To download this dataset and the weights, see above.

poetry run yolo-test --weights weights/yolov3.weights

| Model | mAP (min. 50 IoU) |
| --- | --- |
| YOLOv3 608 (paper) | 57.9 |
| YOLOv3 608 (this impl.) | 57.3 |
| YOLOv3 416 (paper) | 55.3 |
| YOLOv3 416 (this impl.) | 55.5 |

Inference

Uses pretrained weights to make predictions on images. The table below shows the inference speed in frames per second (FPS) for input images scaled to 256x256. The ResNet backbone measurements are taken from the YOLOv3 paper. The Darknet-53 row marked "this impl." shows the inference speed of this implementation on my 1080ti card.

| Backbone | GPU | FPS |
| --- | --- | --- |
| ResNet-101 | Titan X | 53 |
| ResNet-152 | Titan X | 37 |
| Darknet-53 (paper) | Titan X | 76 |
| Darknet-53 (this impl.) | 1080ti | 74 |

poetry run yolo-detect --images data/samples/

Train

For argument descriptions, run poetry run yolo-train --help.

Example (COCO)

To train on COCO using a Darknet-53 backend pretrained on ImageNet run:

poetry run yolo-train --data config/coco.data  --pretrained_weights weights/darknet53.conv.74

Tensorboard

Track training progress in Tensorboard:

poetry run tensorboard --logdir='logs' --port=6006

Storing the logs on a slow drive can significantly slow down training.

You can adjust the log directory using --logdir <path> when running tensorboard and yolo-train.

Train on Custom Dataset

Custom model

Run the command below to create a custom model definition, replacing <num-classes> with the number of classes in your dataset.

./config/create_custom_model.sh <num-classes>  # Will create custom model 'yolov3-custom.cfg'

Classes

Add class names to data/custom/classes.names. This file should have one row per class name.

Image Folder

Move the images of your dataset to data/custom/images/.

Annotation Folder

Move your annotations to data/custom/labels/. The dataloader expects the annotation file for the image data/custom/images/train.jpg to be at data/custom/labels/train.txt. Each row in the annotation file should define one bounding box, using the syntax label_idx x_center y_center width height. The coordinates should be scaled to [0, 1], and label_idx should be zero-indexed and correspond to the row number of the class name in data/custom/classes.names.
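As a minimal sketch of the row format described above, the helper below converts a pixel-space corner box into one annotation row. The function name to_yolo_row and the example numbers are hypothetical and not part of this repo; it assumes the image width and height in pixels are known.

```python
def to_yolo_row(label_idx, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space corner box to 'label_idx x_center y_center width height'."""
    # Normalise the box centre and size by the image dimensions so all values lie in [0, 1]
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{label_idx} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: class 0, box from (100, 50) to (300, 250) in a 400x500 image
row = to_yolo_row(0, 100, 50, 300, 250, img_w=400, img_h=500)
print(row)  # 0 0.500000 0.300000 0.500000 0.400000
```

Each such row would go on its own line of the label file that matches the image name.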

Define Train and Validation Sets

In data/custom/train.txt and data/custom/valid.txt, add paths to images that will be used as train and validation data respectively.
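One possible way to produce these two files is sketched below. The helpers split_paths and write_list, the 20% validation fraction, and the fixed seed are assumptions for illustration, not something this repo provides.

```python
import random


def split_paths(image_paths, valid_fraction=0.2, seed=0):
    """Shuffle image paths reproducibly and split them into (train, valid) lists."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)
    n_valid = int(len(paths) * valid_fraction)
    return paths[n_valid:], paths[:n_valid]


def write_list(list_path, paths):
    """Write one image path per line."""
    with open(list_path, "w") as f:
        f.write("\n".join(paths) + "\n")


# Hypothetical image names; in practice, list the files in data/custom/images/
image_paths = [f"data/custom/images/{i:03d}.jpg" for i in range(10)]
train_paths, valid_paths = split_paths(image_paths)
print(len(train_paths), len(valid_paths))

# Uncomment to write the files expected by the dataloader:
# write_list("data/custom/train.txt", train_paths)
# write_list("data/custom/valid.txt", valid_paths)
```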

Train

To train on the custom dataset run:

poetry run yolo-train --model config/yolov3-custom.cfg --data config/custom.data

Add --pretrained_weights weights/darknet53.conv.74 to train using a backend pretrained on ImageNet.

API

You can import the modules of this repo in your own project if you install the pip package pytorchyolo.

An example prediction call from a simple OpenCV Python script looks like this:

import cv2
from pytorchyolo import detect, models

# Load the YOLO model
model = models.load_model(
  "<PATH_TO_YOUR_CONFIG_FOLDER>/yolov3.cfg",
  "<PATH_TO_YOUR_WEIGHTS_FOLDER>/yolov3.weights")

# Load the image as a numpy array
img = cv2.imread("<PATH_TO_YOUR_IMAGE>")

# Convert OpenCV bgr to rgb
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Runs the YOLO model on the image
boxes = detect.detect_image(model, img)

print(boxes)
# Output will be a numpy array in the following format:
# [[x1, y1, x2, y2, confidence, class]]

For more advanced usage, see the docstrings of the individual methods.
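As a small follow-up sketch, rows in the [x1, y1, x2, y2, confidence, class] format can be turned into labelled results. The helper describe_detections, the confidence cut-off, and the class names used here are hypothetical; only the row layout comes from the example above.

```python
def describe_detections(boxes, class_names, min_confidence=0.5):
    """Turn rows of [x1, y1, x2, y2, confidence, class] into labelled dicts."""
    results = []
    for x1, y1, x2, y2, confidence, class_idx in boxes:
        # Skip detections below the (assumed) confidence cut-off
        if confidence < min_confidence:
            continue
        results.append({
            "box": (float(x1), float(y1), float(x2), float(y2)),
            "confidence": float(confidence),
            "label": class_names[int(class_idx)],
        })
    return results


# Hypothetical detections and class names, standing in for the detect_image output
example_boxes = [[10, 20, 110, 220, 0.9, 1], [0, 0, 5, 5, 0.2, 0]]
example_names = ["person", "dog"]
labelled = describe_detections(example_boxes, example_names)
print(labelled)
```

The same loop works on a numpy array, since iterating over it yields one row per detection.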

Credit

YOLOv3: An Incremental Improvement

Joseph Redmon, Ali Farhadi

Abstract
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that’s pretty swell. It’s a little bigger than last time but more accurate. It’s still fast though, don’t worry. At 320 × 320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 AP50 in 51 ms on a Titan X, compared to 57.5 AP50 in 198 ms by RetinaNet, similar performance but 3.8× faster. As always, all the code is online at https://pjreddie.com/yolo/.

[Paper] [Project Webpage] [Authors' Implementation]

@article{yolov3,
  title={YOLOv3: An Incremental Improvement},
  author={Redmon, Joseph and Farhadi, Ali},
  journal = {arXiv},
  year={2018}
}

Other

YOEO — You Only Encode Once

YOEO extends this repo with the ability to train an additional semantic segmentation decoder. The lightweight example model is mainly targeted towards embedded real-time applications.

pytorch-yolov3's Issues

no save_weights function

Hi, I couldn't find a save_weights function in this code. How can I save the parameters and test results? Thanks.

About loading the labels

Why do you do the following in datasets.py?
x1 = w * (labels[:, 1] - labels[:, 3]/2)
y1 = h * (labels[:, 2] - labels[:, 4]/2)
x2 = w * (labels[:, 1] + labels[:, 3]/2)
y2 = h * (labels[:, 2] + labels[:, 4]/2)
# Adjust for added padding
x1 += pad[1][0]
y1 += pad[0][0]
x2 += pad[1][0]
y2 += pad[0][0]
# Calculate ratios from coordinates
#print labels,float(h) / float(padded_h), h, padded_h, w, padded_w
'''
labels[:, 1] = ((x1 + x2) / 2) / padded_w
labels[:, 2] = ((y1 + y2) / 2) / padded_h
labels[:, 3] *= w / padded_w
labels[:, 4] *= h / padded_h
'''

        labels[:, 3] *= float(w) / float(padded_w)
        labels[:, 4] *= float(h) / float(padded_h)
        labels[:, 1] = ((x1 + x2) / 2) / float(padded_w)
        labels[:, 2] = ((y1 + y2) / 2) / float(padded_h)

About the train and detector

In detect.py (line 109):
for x1, y1, x2, y2, conf, cls_conf, cls_pred in detections:
Should this be:
for x, y, w, h, conf, cls_conf, cls_pred in detections:
because in the YOLO layer pred_boxes is [x, y, w, h]?

error when test

Namespace(batch_size=1, class_path='data/coco.names', conf_thres=0.8, config_path='config/yolov3.cfg', image_folder='data/samples', img_size=416, n_cpu=8, nms_thres=0.4, weights_path='weights/yolov3.weights')
Traceback (most recent call last):
File "detect.py", line 41, in <module>
model.load_weights(opt.weights_path)
File "/mnt/disk_4T_1/Project/Detection/PyTorch-YOLOv3/old_source_code/PyTorch-YOLOv3/models.py", line 250, in load_weights
bn_layer.running_mean.data.copy_(bn_rm)
File "/home/qian/anaconda3/envs/py35torch/lib/python3.5/site-packages/torch/tensor.py", line 407, in data
raise RuntimeError('cannot call .data on a torch.Tensor: did you intend to use autograd.Variable?')
RuntimeError: cannot call .data on a torch.Tensor: did you intend to use autograd.Variable?

Incorrect bounding box coordinates?

Hi, in utils.py, when transforming from center and width to the exact coordinates of box2, you do the following operation:
b2_x1, b2_x2 = box2[:, 0] - box1[:, 2] / 2, box2[:, 0] + box1[:, 2] / 2
b2_y1, b2_y2 = box2[:, 1] - box1[:, 3] / 2, box2[:, 1] + box1[:, 3] / 2
but I think it should be:
b2_x1, b2_x2 = box2[:, 0] - box2[:, 2] / 2, box2[:, 0] + box2[:, 2] / 2
b2_y1, b2_y2 = box2[:, 1] - box2[:, 3] / 2, box2[:, 1] + box2[:, 3] / 2
since the width and height refer to the second box. Am I right or am I missing something? Thanks in advance!
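To illustrate why the choice of box matters, here is a plain-Python sketch of IoU for two boxes in (center x, center y, width, height) format, where each box uses its own width and height. The function names are hypothetical and this is not the repo's bbox_iou.

```python
def center_to_corners(box):
    """(cx, cy, w, h) -> (x1, y1, x2, y2), using the box's own width and height."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def iou_center_format(box1, box2):
    """Intersection over union of two boxes given in center format."""
    b1_x1, b1_y1, b1_x2, b1_y2 = center_to_corners(box1)
    b2_x1, b2_y1, b2_x2, b2_y2 = center_to_corners(box2)
    # Overlap along each axis, clipped at zero when the boxes do not intersect
    inter_w = max(0.0, min(b1_x2, b2_x2) - max(b1_x1, b2_x1))
    inter_h = max(0.0, min(b1_y2, b2_y2) - max(b1_y1, b2_y1))
    inter = inter_w * inter_h
    union = box1[2] * box1[3] + box2[2] * box2[3] - inter
    return inter / union if union > 0 else 0.0


# A 2x2 box centred inside a 4x4 box covers a quarter of it
print(iou_center_format((0, 0, 4, 4), (0, 0, 2, 2)))  # 0.25
```

If the second box were expanded with the first box's width and height, as in the reported line, both boxes in this example would become identical and the IoU would wrongly come out as 1.0.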

Clarification about AP calculation

The average precision calculation, as done in test.py and compute_ap(), differentiates between a correct vector of, say, [1,1,1,0] and [1,0,1,1]; that is, the order of the predictions matters.

Is this necessary for object detection? If yes, why?
If I have an image with 3 ground truths, and the detection vector contains 2 true positives and 1 false positive, why does it matter if I find:
true positive, true positive, false positive -> [1,1,0] vs
true positive, false positive, true positive -> [1,0,1]?
The same predictions are made, but there'll be different APs. Is it fair?

Edit: I am not saying the algorithm is wrong, but some things are fuzzy for me.

Thanks!
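A small worked example may help here. Average precision is computed over detections ranked by confidence, so [1,1,0] and [1,0,1] describe different rankings: in the second, a false positive is ranked above a true positive. The sketch below uses a simplified, non-interpolated AP and is not the repo's compute_ap(); the function name is hypothetical.

```python
def average_precision(hits, num_ground_truths):
    """Simplified AP: sum of the precision at each true positive, divided by the number of ground truths.

    hits is assumed to be ordered by descending confidence (1 = true positive, 0 = false positive).
    """
    true_positives = 0
    precision_sum = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truths


ap_a = average_precision([1, 1, 0], num_ground_truths=3)  # (1/1 + 2/2) / 3
ap_b = average_precision([1, 0, 1], num_ground_truths=3)  # (1/1 + 2/3) / 3
print(round(ap_a, 4), round(ap_b, 4))  # 0.6667 0.5556
```

Under this simplified definition the second ranking scores lower because the detector was more confident about a wrong box than about a correct one.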

About Loss

What training loss values do you get? The loss in my YOLOv3 implementation is very large and hard to converge because the number of grid cells not containing an object is too large. Did you change anything compared to the original implementation?

Training error

Hi, I want to retrain using the COCO dataset.

There is an error, but I don't know how to solve it:

Traceback (most recent call last):
  File "train.py", line 81, in <module>
    for batch_i, (_, imgs, targets) in enumerate(dataloader):
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 286, in __next__
    return self._process_next_batch(batch)
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 307, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 57, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 57, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/kdy/yolov3/utils/datasets.py", line 71, in __getitem__
    h, w, _ = img.shape
ValueError: not enough values to unpack (expected 3, got 0)

Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f19fb602d30>>
Traceback (most recent call last):
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 349, in __del__
    self._shutdown_workers()
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 328, in _shutdown_workers
    self.worker_result_queue.get()
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/queues.py", line 337, in get
    return _ForkingPickler.loads(res)
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 70, in rebuild_storage_fd
    fd = df.detach()
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/home/kdy/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused

about training loss

When I train with COCO or my own data, the loss does not converge, especially for w and h:

371: nGT 19, recall 13, AP 68.42% proposals 0, loss: x 3.118402, y 3.105752, w nan, h nan, conf 164.831421, cls 83.258507, total nan
372: nGT 42, recall 11, AP 26.19% proposals 0, loss: x 3.763822, y 4.745901, w nan, h nan, conf 174.948318, cls 122.696747, total nan
372: nGT 42, recall 21, AP 50.00% proposals 0, loss: x 6.076531, y 5.691019, w nan, h nan, conf 317.763611, cls 157.752930, total nan
372: nGT 42, recall 34, AP 80.95% proposals 0, loss: x 6.518974, y 5.134117, w nan, h nan, conf 478.700623, cls 175.281006, total nan
373: nGT 7, recall 5, AP 71.43% proposals 0, loss: x 0.831325, y 1.967510, w nan, h nan, conf 55.965015, cls 30.674187, total nan
373: nGT 7, recall 7, AP 100.00% proposals 0, loss: x 1.058911, y 1.067211, w nan, h nan, conf 101.454453, cls 30.674187, total nan
373: nGT 7, recall 7, AP 100.00% proposals 0, loss: x 0.877475, y 0.707009, w nan, h nan, conf 135.526093, cls 30.674187, total nan
374: nGT 29, recall 7, AP 24.14% proposals 0, loss: x 4.089700, y 4.640538, w nan, h nan, conf 129.530502, cls 113.932693, total nan
374: nGT 29, recall 12, AP 41.38% proposals 0, loss: x 3.668809, y 6.106874, w nan, h nan, conf 198.104630, cls 127.078773, total nan
374: nGT 29, recall 21, AP 72.41% proposals 0, loss: x 4.712764, y 5.478906, w nan, h nan, conf 295.735779, cls 127.078773, total nan
375: nGT 15, recall 4, AP 26.67% proposals 0, loss: x 2.289522, y 1.391438, w nan, h nan, conf 64.979248, cls 48.202293, total nan
375: nGT 15, recall 7, AP 46.67% proposals 0, loss: x 2.598335, y 2.003327, w nan, h nan, conf 118.649559, cls 61.348373, total nan
375: nGT 15, recall 11, AP 73.33% proposals 0, loss: x 3.218783, y 3.078263, w nan, h nan, conf 178.149246, cls 65.730400, total nan
376: nGT 14, recall 2, AP 14.29% proposals 0, loss: x 1.407205, y 2.768980, w nan, h nan, conf 46.998051, cls 48.202293, total nan
376: nGT 14, recall 6, AP 42.86% proposals 0, loss: x 1.456108, y 3.255125, w nan, h nan, conf 91.687042, cls 56.966347, total nan
376: nGT 14, recall 8, AP 57.14% proposals 0, loss: x 2.827966, y 2.196316, w nan, h nan, conf 131.248718, cls 61.348373, total nan
377: nGT 10, recall 7, AP 70.00% proposals 0, loss: x 0.662892, y 1.921333, w nan, h nan, conf 107.326248, cls 43.820267, total nan
377: nGT 10, recall 9, AP 90.00% proposals 0, loss: x 0.840957, y 1.428768, w nan, h nan, conf 140.404282, cls 43.820267, total nan
377: nGT 10, recall 10, AP 100.00% proposals 0, loss: x 0.755548, y 1.235753, w nan, h nan, conf 174.077911, cls 43.820267, total nan

Error when training coco @batch 1473, epoch 0

Epoch 0/1000, Batch 1473/7329 | Losses: x 0.126482, y 0.129091, w 0.692879, h 0.779148, conf 0.268520, cls 1.789521, total 3.785640
Traceback (most recent call last):
File "/mnt/diskb/even/yolov3_pytorch/train.py", line 97, in <module>
for batch_i, (_, imgs, targets) in enumerate(dataloader):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 264, in __next__
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 264, in <listcomp>
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/mnt/diskb/even/yolov3_pytorch/utils/datasets.py", line 73, in __getitem__
h, w, _ = img.shape
ValueError: not enough values to unpack (expected 3, got 0)

Negative values for some coordinates

What is the reason for negative values in some coordinates when converting from
(center x, center y, width, height) to (x1, y1, x2, y2) (in the non_max_suppression function)?

For example, the values for x1 coordinates are calculated as:
box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2 and there are some negative values here.

box_corner[:, :, 0] <= 0 boils down to sigmoid(tx) + grid_x <= anchor_w * exp(tw) / 2.

How can this happen?

Thanks!
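As a sketch of how this can arise: nothing in the center/size parameterisation keeps the half-width smaller than the distance from the center to the image border, so a box centred near an edge can extend past it. The helper name and the numbers below are hypothetical, and clamping is shown only as one common way to handle it, not as what this repo does.

```python
def center_to_corner_clamped(cx, cy, w, h, img_size):
    """Return the raw corner box and a copy clamped to [0, img_size]."""
    raw = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    clamped = tuple(min(max(v, 0.0), float(img_size)) for v in raw)
    return raw, clamped


# Box centred 20 px from the left border but 60 px wide: x1 = 20 - 30 = -10
raw_box, clamped_box = center_to_corner_clamped(20, 100, 60, 40, img_size=416)
print(raw_box)      # (-10.0, 80.0, 50.0, 120.0)
print(clamped_box)  # (0.0, 80.0, 50.0, 120.0)
```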

About the anchors

I have two questions about the anchors used in the config file of YOLOv3:

1: The smallest anchors are used at the end and the biggest at the beginning. Why is that, and shouldn't it be the other way around?

2: What format are the anchors in? Are they based on the size of the input, so that if I compute relative anchor sizes (between 0 and 1) and my input size is 416x416, I just multiply the input size by these values and write the results in the config file?

training error

[Epoch 0/30, Batch 0/14658] [Losses: x 0.155323, y 0.153960, w 1.839854, h 1.919111, conf 2.058636, cls 2.145512, total 8.272396, recall: 0.00000]
Traceback (most recent call last):
File "/mnt/diskb/even/yolov3_pytorch/train.py", line 100, in <module>
model.seen += imgs.size(0)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 532, in __getattr__
type(self).__name__, name))
AttributeError: 'Darknet' object has no attribute 'seen'

Question regarding build_targets

I found that conf_mask[b, anch_ious > ignore_thres] = 0 will overwrite any previous ground truth target that has conf_mask set to 1.

    for t in range(target.shape[1]):
        if target[b, t].sum() == 0:
            continue
        .......
        # Calculate iou between gt and anchor shapes
        anch_ious = bbox_iou(gt_box, anchor_shapes)
        # Where the overlap is larger than threshold set mask to zero (ignore)
        conf_mask[b, anch_ious > ignore_thres] = 0    <---------------------
        # Find the best matching anchor box
        best_n = np.argmax(anch_ious)
        ...
        # Masks
        mask[b, best_n, gj, gi] = 1
        conf_mask[b, best_n, gj, gi] = 1    <-------------------

Let's say:

Iter1: anch_ious = [0.567 0.305 0.43], anch_ious>ignore_thres=0,best_n=0, conf_mask[b, 0, gj, gi]= 1
Iter2: anch_ious = [0.667 0.045 0.33], anch_ious>ignore_thres=0,best_n=0, conf_mask[b, 0, gj, gi]= 1
Iter3: anch_ious = [0.734 0.025 0.22], anch_ious>ignore_thres=0,best_n=0, conf_mask[b, 0, gj, gi]= 1

So every iteration erases the conf_mask of the previous iteration's ground truth target if the new target selects the same anchor. For the example above, only Iter3: conf_mask[b, 0, gj, gi] = 1 is kept. It seems like more than half of the ground truth targets are usually ignored during training. Is this intended behavior, and why would it work for training?
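The effect described above can be reproduced with a tiny plain-Python mock-up. It assumes that the boolean mask indexes the anchor dimension (so a whole anchor plane is zeroed), that conf_mask starts as all ones, and that ignore_thres is 0.5; the function apply_target and the grid positions are hypothetical.

```python
def apply_target(conf_mask, anch_ious, ignore_thres, gj, gi):
    """Mimic the two conf_mask updates from build_targets for one ground truth box."""
    # conf_mask[b, anch_ious > ignore_thres] = 0: zero the whole plane of every matching anchor
    for a, iou in enumerate(anch_ious):
        if iou > ignore_thres:
            for row in conf_mask[a]:
                row[:] = [0] * len(row)
    # Best matching anchor gets its cell set back to 1
    best_n = max(range(len(anch_ious)), key=lambda a: anch_ious[a])
    conf_mask[best_n][gj][gi] = 1
    return best_n


num_anchors, grid = 3, 4
conf_mask = [[[1] * grid for _ in range(grid)] for _ in range(num_anchors)]

apply_target(conf_mask, [0.567, 0.305, 0.43], ignore_thres=0.5, gj=0, gi=0)
first_after_one = conf_mask[0][0][0]   # first target's cell is marked

apply_target(conf_mask, [0.667, 0.045, 0.33], ignore_thres=0.5, gj=2, gi=3)
first_after_two = conf_mask[0][0][0]   # first target's cell after the second target

print(first_after_one, first_after_two)  # 1 0
```

Under these assumptions the first target's entry is cleared as soon as a second target matches the same anchor in a different cell, which matches the behaviour the question describes.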

about trained weights

I followed the instructions and trained a model on the COCO dataset provided by the author. After 20 epochs, the total loss is about 0.2-0.3. When I use these newly trained weights to detect the sample images, no detections are made. Does anybody have the same problem?

torch.cuda.is_available bug

test.py and train.py always set cuda = True because the expression tests whether the function exists rather than whether CUDA is present. The conditional expression is also redundant. I suggest replacing
cuda = True if torch.cuda.is_available else False

with
cuda = torch.cuda.is_available()
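The difference can be shown without PyTorch installed. In the sketch below, is_available is a hypothetical stand-in for torch.cuda.is_available on a machine without CUDA.

```python
def is_available():
    """Stand-in for torch.cuda.is_available on a machine without CUDA."""
    return False


# Without parentheses the function object itself is tested, and function objects are always truthy
buggy = True if is_available else False

# Calling the function returns the actual answer
fixed = is_available()

print(buggy, fixed)  # True False
```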

training error

When I run train.py I get an error:
Traceback (most recent call last):
File "/home/lc/Downloads/PyTorch-YOLOv3/train.py", line 88, in <module>
loss += model(sub_imgs, sub_targets)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/lc/Downloads/PyTorch-YOLOv3/models.py", line 214, in forward
x = module[0](x, targets)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/lc/Downloads/PyTorch-YOLOv3/models.py", line 175, in forward
loss_cls = self.class_scale * self.ce_loss(pred_cls, tcls)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/loss.py", line 759, in forward
self.ignore_index, self.reduce)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py", line 1442, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, size_average, ignore_index, reduce)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py", line 944, in log_softmax
return torch._C._nn.log_softmax(input, dim)
RuntimeError: dimension out of range (expected to be in range of [-1, 0], but got 1)
The error occurs in:
def forward(self, x, targets=None):
elif module_def['type'] == 'yolo':
x = module[0](x, targets)
# If predictions: concatenate / if loss: add to total loss
output = output + [x] if targets is None else output + x

no result

I trained on my own dataset. Now I use detect.py to detect objects in an image, but I only get one big white bounding box around the whole image and nothing else. Can anyone help? Thanks.

Issue when detecting with own weights

I trained the model on my own dataset, and got weights from that. When I want to detect objects using these weights, I get the following error:

Traceback (most recent call last):
  File "detect_OP.py", line 42, in <module>
    model.load_weights(opt.weights_path)
  File "/home/robzelluf/Desktop/PyTorch-YOLOv3/models.py", line 265, in load_weights
    conv_w = torch.from_numpy(weights[ptr:ptr + num_w]).view_as(conv_layer.weight)
  File "/home/robzelluf/.local/lib/python3.5/site-packages/torch/tensor.py", line 230, in view_as
    return self.view(tensor.size())
RuntimeError: invalid argument 2: size '[1024 x 512 x 3 x 3]' is invalid for input with 3837339 elements at /pytorch/aten/src/TH/THStorage.c:41

Can anyone help me with this?

Some problems with saving weights

I've been following this repo for about a week and got it to start training on my own data. But when I tried to save weights, I got an error:

   AttributeError: 'Darknet' object has no attribute 'seen' 

I checked models.py. Unfortunately, self.seen and self.header_info are missing from class Darknet. I looked up some material and declared them as:

self.seen = 0
self.header_info = torch.IntTensor([0,0,0,0,0])

added two statements in save_weights:

    header = self.header_info
    header = header.numpy()

and changed self.header_info.tofile(fp) into header_info.tofile(fp). After all these modifications, weights can be saved and loaded. But when I load them in detect.py, no detections are made.

I wonder whether this problem is caused by my modification of seen and header_info or by too few epochs. It would be really nice if you could offer some suggestions. Thanks for your time.

MAP

The value computed as mAP in test.py is actually precision.

Only found just one loss

Hi,
I only found one loss, on one scale. Could you please point out where the other two losses are?

RuntimeError: unique is currently CPU-only, and lacks CUDA support. Pull requests welcome!

Traceback (most recent call last):
File "test.py", line 68, in <module>
detections = non_max_suppression(detections, 80, opt.conf_thres, opt.nms_thres)
File "/PyTorch-YOLOv3/utils.py", line 73, in non_max_suppression
for c in detections[:, -1].unique():
File "/usr/local/lib/python3.5/dist-packages/torch/tensor.py", line 310, in unique
sorted=sorted, return_inverse=return_inverse)
RuntimeError: unique is currently CPU-only and lacks CUDA support. Pull requests welcome!

It fails as shown above. Does unique only work on the CPU?
My environment is:
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
torch.__version__
'0.4.0'
4 Titan X GPUs

core dumped

Core was generated by `python train.py'. Program terminated with signal SIGSEGV, Segmentation fault.
Core dumped!!!

Question about build_targets

In models.py build_targets is called by passing dim=g_dim (aka the input size) and anchors=scaled_anchors (aka anchors scaled down by stride) so that here in utils.py the IoU is computed between the groundtruth box scaled by dim and scaled_anchors (both zero centered).

This doesn't look right to me.
Shouldn't either the gt boxes be scaled by dim / stride or the scaled anchors not be scaled at all?

GPU Insufficient memory!!

Hello,
My GPU is a 1080 Ti. However, I run out of video memory when training. Is it necessary to split every batch into subdivisions?

a bug in datasets.py

labels[:, 3] *= w / padded_w
labels[:, 4] *= h / padded_h
to
labels[:, 3] *= float(w) / padded_w
labels[:, 4] *= float(h) / padded_h
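The suggested change presumably targets Python 2, where dividing two integers with / truncates. The sketch below assumes w and padded_w are plain ints and uses // to reproduce the Python 2 behaviour under Python 3; the sizes are hypothetical.

```python
w, padded_w = 416, 608

# Python 2's int / int behaves like floor division, so the scale factor collapses to 0
floor_ratio = w // padded_w

# Converting one operand to float gives the intended ratio in both Python 2 and 3
true_ratio = float(w) / padded_w

print(floor_ratio, round(true_ratio, 4))  # 0 0.6842
```

With the truncated ratio, every label width and height would be multiplied by 0.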

save weights error

Traceback (most recent call last):
File "train.py", line 108, in <module>
model.save_weights('%s/%d.weights' % (opt.checkpoint_dir, epoch))
File "F:\cmp\VehicleDetection\PyTorch-YOLOv3\models.py", line 306, in save_weights
self.header_info[3] = self.seen
File "D:\software\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __getattr__
type(self).__name__, name))
AttributeError: 'Darknet' object has no attribute 'seen'

What's the meaning of tw and th in your loss function?

When you calculate the loss, the code for tw and th looks like this:

# Width and height
tw[b, best_n, gj, gi] = math.log(gw/anchors[best_n][0] + 1e-16)
th[b, best_n, gj, gi] = math.log(gh/anchors[best_n][1] + 1e-16)

I don't understand why you use log. In the paper, the loss function for w and h is
[image]
And why do loss_x and loss_y use BCELoss? It's a regression problem, so I think MSELoss is applicable.
Can you explain it? Thanks.
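On the log: the YOLOv3 paper predicts box width as b_w = p_w * exp(t_w), where p_w is the anchor width, so the regression target is the inverse, t_w = log(g_w / p_w). A minimal round-trip sketch, with hypothetical function names and example sizes:

```python
import math


def encode_size(gw, gh, anchor_w, anchor_h, eps=1e-16):
    """Ground truth size -> (tw, th), as in the snippet above."""
    return math.log(gw / anchor_w + eps), math.log(gh / anchor_h + eps)


def decode_size(tw, th, anchor_w, anchor_h):
    """(tw, th) -> box size, following b_w = p_w * exp(t_w)."""
    return anchor_w * math.exp(tw), anchor_h * math.exp(th)


# A ground truth box twice as wide and half as tall as its anchor (grid units)
tw, th = encode_size(gw=6.0, gh=2.0, anchor_w=3.0, anchor_h=4.0)
bw, bh = decode_size(tw, th, anchor_w=3.0, anchor_h=4.0)
print(round(tw, 4), round(th, 4))  # 0.6931 -0.6931
print(round(bw, 4), round(bh, 4))  # 6.0 2.0
```

The targets are symmetric around zero for boxes larger or smaller than the anchor, which is one reason the log form is convenient to regress.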

how to retrain on custom dataset

Hi,

I have a custom dataset with bounding box info and want to retrain YOLO-tiny. How can this be done in this minimal version?

What is the real mAP of this implementation ?

I think the mAP you mention on the readme of the project is the one you computed with your old evaluation which wasn't a mAP computation. I know there has been a new implementation recently and I used it to compute a mAP on my own version of YOLOv3. The value I get for a 0.5 threshold (VOC mAP) is 0.65 with your mAP implementation. I also wrote a mAP evaluation module and mine gives me a 0.41 VOC mAP. All these evaluations are done on the official weights and on the COCO dataset with 416x416 inputs.

The official score is 0.55, my evaluation might be wrong or it might be that my darknet implementation is missing things / have small mistakes. The fact that your evaluation gets such a high score tells me that it is almost certainly wrong though. The last option is that the author made a mistake himself which I would consider unlikely. I will review the code soon and complete the issue if I find mistakes but I thought I would mention it now since your mAP numbers are already in the readme.

Another completely different possibility is that your evaluation module is correct and the high score is explained by the fact that I use the validation set of 2014 and that the author trained the official weights on them. Which might be the most likely case because I just noticed that in the script he uses to download the coco dataset he creates a validation subset of 5k images.

I will consider the last possibility as the correct one and I'll reopen the issue if I notice errors in the evaluation module.

CrossEntropyLoss after Sigmoid on class predictions

I think the default torch.nn.CrossEntropyLoss(size_average=False) loss used between predicted and true classes is not the correct choice, and here's why:

Class predictions are passed through a sigmoid: pred_cls = torch.sigmoid(prediction[:, :, 5:]), therefore, the values are between [0, 1].
In the perfect case (where truth class = 1 and other classes = 0), the difference (in magnitude) is (and always will be at most) 1. Taking the log of the softmax (what CrossEntropyLoss does) of a vector with values between [0, 1] isn't helping the loss function to clearly detect the correct class, again, because the relative difference is small.

In my understanding, CrossEntropyLoss measures the relative difference between the truth class and other classes (i.e.: a good classification = large value for truth class and small values for other classes). For example: [0.1, 2.1, 23.1, 0.7] is a good prediction for class [3], the relative difference between the 3rd element and the rest is big. Please correct me if I am wrong.

Maybe we can just use MSELoss for classes, too?
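The concern about bounded inputs can be made concrete with a plain-Python sketch. It assumes 80 classes and computes softmax cross-entropy by hand rather than calling PyTorch; the function name is hypothetical.

```python
import math


def cross_entropy(scores, target_idx):
    """Softmax cross-entropy for one sample: log-sum-exp of the scores minus the target score."""
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[target_idx]


num_classes = 80

# Best possible sigmoid output: 1 for the true class, 0 for all others
best_sigmoid = [0.0] * num_classes
best_sigmoid[3] = 1.0
bounded_loss = cross_entropy(best_sigmoid, 3)

# Unbounded logits can push the loss arbitrarily close to zero
confident_logits = [0.0] * num_classes
confident_logits[3] = 23.1
unbounded_loss = cross_entropy(confident_logits, 3)

print(round(bounded_loss, 3))  # 3.403
print(unbounded_loss < 1e-6)   # True
```

With inputs confined to [0, 1], the loss for 80 classes cannot drop below log(e + 79) - 1, about 3.40, even for a perfect prediction, which supports the point that the relative differences are too small for this loss.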

Error in training

[Epoch 0/30, Batch 1830/1833] [Losses: x 0.018887, y 0.017350, w 0.011480, h 0.009068, conf 0.032589, cls 0.145579, total 0.234953, recall: 0.65000]
[Epoch 0/30, Batch 1830/1833] [Losses: x 0.027525, y 0.027824, w 0.010730, h 0.007892, conf 0.054383, cls 0.179986, total 0.308339, recall: 0.81111]
[Epoch 0/30, Batch 1830/1833] [Losses: x 0.012489, y 0.012089, w 0.010546, h 0.007774, conf 0.033698, cls 0.192683, total 0.269278, recall: 0.54545]
[Epoch 0/30, Batch 1830/1833] [Losses: x 0.042511, y 0.042000, w 0.023847, h 0.016442, conf 0.075596, cls 0.151306, total 0.351703, recall: 0.67361]
[Epoch 0/30, Batch 1830/1833] [Losses: x 0.055068, y 0.053277, w 0.060733, h 0.038808, conf 0.099795, cls 0.244854, total 0.552534, recall: 0.65027]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.034622, y 0.033223, w 0.014849, h 0.012658, conf 0.059653, cls 0.104131, total 0.259136, recall: 0.67568]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.038405, y 0.035833, w 0.044459, h 0.039020, conf 0.054494, cls 0.141215, total 0.353426, recall: 0.52899]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.015338, y 0.015971, w 0.013100, h 0.005587, conf 0.035706, cls 0.174440, total 0.260142, recall: 0.52381]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.023167, y 0.022108, w 0.033820, h 0.017704, conf 0.047694, cls 0.193106, total 0.337599, recall: 0.50617]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.014519, y 0.014234, w 0.011042, h 0.005328, conf 0.033018, cls 0.121948, total 0.200090, recall: 0.59259]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.022603, y 0.022197, w 0.011902, h 0.008511, conf 0.041803, cls 0.088609, total 0.195626, recall: 0.61111]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.016306, y 0.014795, w 0.009144, h 0.004840, conf 0.039758, cls 0.154830, total 0.239672, recall: 0.62963]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.021347, y 0.020986, w 0.019330, h 0.012508, conf 0.042181, cls 0.147103, total 0.263455, recall: 0.60870]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.040201, y 0.038973, w 0.015251, h 0.036656, conf 0.073650, cls 0.174954, total 0.379684, recall: 0.54815]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.026244, y 0.026605, w 0.030584, h 0.021944, conf 0.048804, cls 0.160845, total 0.315027, recall: 0.53846]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.010117, y 0.009941, w 0.004098, h 0.005762, conf 0.027249, cls 0.178949, total 0.236117, recall: 0.60000]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.037965, y 0.038500, w 0.014596, h 0.019955, conf 0.056791, cls 0.110870, total 0.278677, recall: 0.64912]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.013917, y 0.013995, w 0.008014, h 0.005817, conf 0.034943, cls 0.167149, total 0.243835, recall: 0.47222]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.027614, y 0.026907, w 0.021914, h 0.014159, conf 0.062784, cls 0.209417, total 0.362795, recall: 0.61111]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.035124, y 0.033940, w 0.018896, h 0.013856, conf 0.060784, cls 0.174728, total 0.337329, recall: 0.71053]
[Epoch 0/30, Batch 1831/1833] [Losses: x 0.023849, y 0.023081, w 0.040256, h 0.029740, conf 0.047857, cls 0.191115, total 0.355898, recall: 0.53333]
[Epoch 0/30, Batch 1832/1833] [Losses: x 0.061831, y 0.061271, w 0.036845, h 0.032102, conf 0.117929, cls 0.158679, total 0.468657, recall: 0.66667]
[Epoch 0/30, Batch 1832/1833] [Losses: x 0.048841, y 0.048758, w 0.083654, h 0.054965, conf 0.089770, cls 0.146247, total 0.472235, recall: 0.59259]
[Epoch 0/30, Batch 1832/1833] [Losses: x 0.036920, y 0.038865, w 0.014582, h 0.023073, conf 0.077787, cls 0.183660, total 0.374887, recall: 0.60000]
[Epoch 0/30, Batch 1832/1833] [Losses: x 0.031886, y 0.031555, w 0.011043, h 0.014374, conf 0.057977, cls 0.117786, total 0.264621, recall: 0.60000]
Traceback (most recent call last):
  File "train.py", line 88, in <module>
    loss = model(img, target)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wang/jiayunpeng/jpy-yolo/PyTorch-YOLOv3-master/models.py", line 194, in forward
    x = module(x)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: input has less dimensions than expected

Error with own dataset

Hello,

I found an error in the last commit:

Traceback (most recent call last):
  File "train.py", line 82, in <module>
    loss = model(imgs, targets)
  File "/home/alupotto/anaconda3/envs/pt4_cu9_p35/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/alupotto/PycharmProjects/tracktorch/models.py", line 213, in forward
    x, *losses = module[0](x, targets)
  File "/home/alupotto/anaconda3/envs/pt4_cu9_p35/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/alupotto/PycharmProjects/tracktorch/models.py", line 177, in forward
    return loss, loss_x.item(), loss_y.item(), loss_w.item(), loss_h.item(), loss_conf.item(), loss_cls.item(), float(nCorrect/nGT)
ZeroDivisionError: division by zero

I am training with my own data and labels, and some of my images have no ground truth at all: the .txt label file exists but is empty, so nGT is 0 for those images.

I added a check before returning float(nCorrect/nGT) in models.py:

    AP = float(nCorrect / nGT) if nGT != 0 else 0

and then I return AP.

I mention it in case somebody has the same problem or for future updates.
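
The guard described above can be sketched as a small standalone helper. This is only an illustration, not the repository's code: the name safe_ratio and its default argument are hypothetical, and returning 0 for an image without ground truth is simply the choice made in this post.

```python
def safe_ratio(n_correct, n_gt, default=0.0):
    """Return n_correct / n_gt, or `default` when there is no ground truth.

    `safe_ratio` and `default` are hypothetical names for this sketch.
    """
    # Use != for the comparison: `is not` tests object identity,
    # which is not a reliable way to compare numbers.
    if n_gt != 0:
        return float(n_correct) / float(n_gt)
    return default


print(safe_ratio(13, 20))  # image with 20 ground-truth boxes, 13 matched
print(safe_ratio(0, 0))    # image whose label file is empty
```

With this in place, an empty label file no longer raises ZeroDivisionError; whether 0 is the right value to log for such images is a separate design decision.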

About training

After training for 8 epochs on my 1080 Ti (about 6 hours), the checkpoints folder has 7 files: [0..7].weights. I ran "python detect.py --weights_path checkpoints/7.weights" to detect, but got no detections. If I download the weights from "https://pjreddie.com/media/files/yolov3.weights", it can detect. So I suspect there is something wrong with the training, or do I have to wait until all 30 epochs are done?

Training with my own data

Hi, I changed utils/datasets.py and some of the config.
I also made a mini training dataset that includes 8 images and their labels.
But when I train on it, the loss is very large (total loss over 1000), so I can't get correct detection results.
So, is there possibly a bug in the training code?

Possible bugs in YOLO Layer's forward

  1. The no object loss component of the confidence loss:
self.lambda_noobj * self.bce_loss(conf * (1 - mask), mask * (1 - mask))

basically compares conf * (1 - mask) with mask * (1 - mask), but mask * (1 - mask) will be a tensor full of zeros. I think it should be:

self.lambda_noobj * self.bce_loss(conf * (1 - mask), tconf * (1 - mask))
  2. The AP calculation has 2 problems:
  • Isn't nCorrect / nGT actually the recall? Precision is the number of correct predictions divided by the number of all predictions.
  • if there are no ground truths, nCorrect / nGT will be nan, so the correct expression must be:
    1 if (nCorrect == nGT == 0) else (nCorrect / nGT)
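
To make the distinction in the second point concrete, here is a minimal sketch that computes both quantities from raw counts. The function name, the n_proposals argument, and the `empty` parameter are assumptions for illustration; the value 1.0 for an empty denominator follows the expression suggested above and is a convention, not a fixed rule.

```python
def precision_recall(n_correct, n_proposals, n_gt, empty=1.0):
    """Return (precision, recall) from raw counts.

    n_correct   -- predictions matched to a ground-truth box
    n_proposals -- all predictions that were made (hypothetical argument)
    n_gt        -- ground-truth boxes
    empty       -- value used when a denominator is zero (assumed convention)
    """
    # Precision: correct predictions divided by all predictions.
    precision = n_correct / n_proposals if n_proposals > 0 else empty
    # Recall: correct predictions divided by all ground-truth boxes.
    recall = n_correct / n_gt if n_gt > 0 else empty
    return precision, recall


# 8 predictions, 6 of them correct, 10 objects in the image.
print(precision_recall(6, 8, 10))
# Image without objects and without predictions.
print(precision_recall(0, 0, 0))
```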

No detection after overfitting my own dataset.

I trained this model to detect hands. As a first check, I trained it on 32 images for 60 epochs, and this is what I get after 60 epochs:
image
But when I ran detect.py on the same dataset that I overfitted, there were no bounding boxes.
I changed the configuration, i.e. the number of classes from 80 to 1 and the filters from 255 to 18, as well as the coco.data and coco.names files.
The labels are as follows:
[class name, width_center, height_center, width, height]
Example: [0, 0.4, 0.3, 0.2, 0.1]
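
For reference, a label in this layout can be produced from a pixel-space box as sketched below. This assumes that width_center and height_center mean the box centre along x and y, and that all four values are normalized by the image size; the function name and the sample numbers are hypothetical.

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to
    [class, x_center, y_center, width, height], normalized to [0, 1].

    Hypothetical helper; assumes the label layout described in the post.
    """
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return [class_id, x_center, y_center, width, height]


# Made-up box in a 400x400 image, chosen to match the example label above.
label = to_yolo_label(0, 120, 100, 200, 140, img_w=400, img_h=400)
print(label)
```

If the values in a label file fall outside the 0 to 1 range, the labels were probably not normalized, which is worth ruling out before looking for bugs in training.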

When I print the detections in detect.py, I get [nan, nan, nan, ...].

There are no results in the output.

about backward

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Computing confidence mask

In the build_targets function, at the beginning, there's a part that calculates the confidence mask tensor.
Initially it is set to a tensor of ones, but the update rule:

    # Objects with higher confidence than threshold are set to zero
    conf_mask[b][cur_ious.view_as(conf_mask[b]) > ignore_thres] = 0

doesn't make sense to me. This basically ignores any ious better than ignore_thres (currently set to 0.5).

I'd think that:

  • either start with a tensor of zeros and use the update rule:
    conf_mask[b][cur_ious.view_as(conf_mask[b]) > thres] = 1
  • or flip the comparison: conf_mask[b][cur_ious.view_as(conf_mask[b]) <= thres] = 0
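
The three rules can be compared side by side on a few values. The sketch below uses plain Python lists in place of tensors and made-up IoU values, so it only illustrates the masking logic, not the actual build_targets code.

```python
ignore_thres = 0.5
# Made-up IoUs between one ground-truth box and four anchors.
ious = [0.1, 0.55, 0.9, 0.3]

# Rule quoted from build_targets: start from ones, zero out IoU > threshold.
current = [0 if iou > ignore_thres else 1 for iou in ious]

# First alternative: start from zeros, set entries with IoU > threshold to one.
alt_from_zeros = [1 if iou > ignore_thres else 0 for iou in ious]

# Second alternative: start from ones, zero out entries with IoU <= threshold.
alt_flipped = [0 if iou <= ignore_thres else 1 for iou in ious]

print(current)
print(alt_from_zeros)
print(alt_flipped)
```

On these values the two alternatives produce the same mask, and that mask is the exact complement of the one produced by the quoted rule.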

Thanks
