
yolact's People

Contributors

agade09, breznak, chongzhou96, dbolya, dte, glenn-jocher, shafu0x, zllrunning


yolact's Issues

More experiments would be nicer

The paper says that Box2Pix relies on an extremely lightweight backbone detector.
I think more experiments would be nicer, perhaps a comparison like this:

            KITTI    Cityscapes    COCO
box2pix
yolact

Also, a YOLACT-lite might be good, just like YOLO-lite, using a lightweight backbone (like Xception).
This is YOLACT v1, just like YOLO v1.
I am wondering whether the encoder-decoder architecture or the atrous convolution adopted by DeepLab v3+ would help.
Looking forward to YOLACT v2...

IndexError: list index out of range

Hello!
I trained this model on my own dataset, but it fails in the mAP evaluation phase. Does anyone have the same problem?

(tensorflow) root@gpuserver:/home/gpuserver/models/yolact# python train.py --config=yolact_base_config
loading annotations into memory...
Done (t=0.03s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Initializing weights...
Begin training!

/root/anaconda3/envs/tensorflow/lib/python3.6/site-packages/torch/nn/parallel/_functions.py:61: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
[ 0] 0 || B: 5.480 | C: 23.075 | M: 5.976 | S: 67.004 | T: 101.536 || ETA: 0:00:00 || timer: 23.377
[ 0] 10 || B: 4.757 | C: 18.774 | M: 5.625 | S: 47.000 | T: 76.155 || ETA: 11 days, 7:25:02 || timer: 1.176
[ 0] 20 || B: 4.587 | C: 15.804 | M: 5.362 | S: 29.147 | T: 54.900 || ETA: 11 days, 7:50:00 || timer: 1.180
[ 0] 30 || B: 4.582 | C: 13.355 | M: 5.309 | S: 19.954 | T: 43.199 || ETA: 11 days, 6:14:29 || timer: 1.272
[ 0] 40 || B: 4.553 | C: 11.175 | M: 5.306 | S: 15.150 | T: 36.183 || ETA: 11 days, 6:35:18 || timer: 1.266
[ 0] 50 || B: 4.497 | C: 9.617 | M: 5.303 | S: 12.227 | T: 31.645 || ETA: 11 days, 6:12:37 || timer: 1.120
[ 0] 60 || B: 4.433 | C: 8.514 | M: 5.290 | S: 10.265 | T: 28.503 || ETA: 11 days, 4:48:22 || timer: 1.166
[ 1] 70 || B: 4.383 | C: 7.700 | M: 5.304 | S: 8.850 | T: 26.237 || ETA: 11 days, 7:53:19 || timer: 1.236
[ 1] 80 || B: 4.339 | C: 7.073 | M: 5.269 | S: 7.781 | T: 24.464 || ETA: 11 days, 6:55:00 || timer: 1.173
[ 1] 90 || B: 4.294 | C: 6.585 | M: 5.250 | S: 6.945 | T: 23.074 || ETA: 11 days, 6:22:39 || timer: 1.217
[ 1] 100 || B: 4.235 | C: 6.015 | M: 5.230 | S: 5.666 | T: 21.147 || ETA: 11 days, 5:30:35 || timer: 1.259
[ 1] 110 || B: 4.131 | C: 4.426 | M: 5.184 | S: 1.178 | T: 14.920 || ETA: 11 days, 4:57:38 || timer: 1.177
[ 1] 120 || B: 4.045 | C: 3.427 | M: 5.202 | S: 0.242 | T: 12.915 || ETA: 11 days, 4:43:12 || timer: 1.214
[ 2] 130 || B: 3.926 | C: 2.860 | M: 5.195 | S: 0.192 | T: 12.174 || ETA: 11 days, 5:57:53 || timer: 2.714
[ 2] 140 || B: 3.817 | C: 2.654 | M: 5.138 | S: 0.180 | T: 11.789 || ETA: 11 days, 5:43:35 || timer: 1.230
[ 2] 150 || B: 3.694 | C: 2.571 | M: 5.045 | S: 0.170 | T: 11.480 || ETA: 11 days, 5:23:29 || timer: 1.217
[ 2] 160 || B: 3.617 | C: 2.516 | M: 4.966 | S: 0.158 | T: 11.256 || ETA: 11 days, 5:37:45 || timer: 1.277
[ 2] 170 || B: 3.540 | C: 2.467 | M: 4.876 | S: 0.149 | T: 11.031 || ETA: 11 days, 5:16:56 || timer: 1.222
[ 2] 180 || B: 3.440 | C: 2.419 | M: 4.831 | S: 0.141 | T: 10.831 || ETA: 11 days, 4:54:42 || timer: 1.176
[ 2] 190 || B: 3.342 | C: 2.364 | M: 4.716 | S: 0.135 | T: 10.558 || ETA: 11 days, 4:41:58 || timer: 1.187

Computing validation mAP (this may take a while)...

Traceback (most recent call last):
File "train.py", line 374, in <module>
train()
File "train.py", line 303, in train
compute_validation_map(yolact_net, val_dataset)
File "train.py", line 367, in compute_validation_map
eval_script.evaluate(yolact_net, dataset, train_mode=True)
File "/home/gpuserver/models/yolact/eval.py", line 791, in evaluate
prep_metrics(ap_data, preds, img, gt, gt_masks, h, w, num_crowd, dataset.ids[image_idx], detections)
File "/home/gpuserver/models/yolact/eval.py", line 401, in prep_metrics
ap_obj = ap_data[iou_type][iouIdx][_class]
IndexError: list index out of range
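
One plausible cause, offered as a guess rather than a confirmed diagnosis: the failing line indexes a per-class list with `_class`, so a class index that is not smaller than the number of classes the evaluator allocated would raise exactly this error. The sketch below assumes COCO-style annotations and a `label_map` like the one in config.py; `find_out_of_range_labels` is a hypothetical helper, not part of the repository.

```python
def find_out_of_range_labels(annotations, label_map, num_classes):
    """Return category ids that cannot be mapped to a valid class index.

    annotations: list of COCO-style dicts, each with a 'category_id' key.
    label_map:   dict from category id to a 1-based class label.
    num_classes: number of foreground classes (background excluded).
    """
    bad = set()
    for ann in annotations:
        cat_id = ann["category_id"]
        if cat_id not in label_map:
            bad.add(cat_id)  # no mapping at all for this category id
            continue
        class_idx = label_map[cat_id] - 1  # convert to a 0-based index
        if not 0 <= class_idx < num_classes:
            bad.add(cat_id)  # mapped, but past the end of a per-class list
    return sorted(bad)


# Toy data: three categories, but only two class names configured.
anns = [{"category_id": 1}, {"category_id": 3}, {"category_id": 7}]
label_map = {1: 1, 3: 2, 7: 3}
print(find_out_of_range_labels(anns, label_map, num_classes=2))
```

If the list comes back non-empty for a custom dataset, the class names and the label map in the dataset config are worth comparing against the categories in the annotation file.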

Support on Multi-GPU?

Hi, dbolya,

I did not find DataParallel in your yolact.py, which defines the model. So does the code in your repo not support multi-GPU properly?
I tried simply setting CUDA_VISIBLE_DEVICES to assign multiple GPUs, but according to the training log the performance is not right.

Thanks!

Inference speed problem on my own environment

First of all, thanks for sharing the amazing work!
Following the instructions, I set up the environment and can run the code successfully. However, when running eval.py, the inference speed is slower than expected.
For the ResNet101-FPN model, testing on the COCO validation set gives about 9 FPS, and testing on my own Kinect images (640*480), with plotting and saving disabled, gives about 14 FPS.
My environment is: GTX 1080, CUDA 8.0, cudatoolkit 8.0. I am using Anaconda, and GPU support is checked via

torch.cuda.is_available()

I am new to PyTorch, so I am wondering whether I have missed some configuration or dependency.

Thanks!

How to see graph structure?

Hi,
I want to see the data flow to understand this article. However, I have never used torch. Could you send me a graph logdir generated with tensorboardX? Thank you in advance.

Computational time with own code

Hi, thank you for the awesome work!
For some reasons, I have to rewrite your eval.py myself.
However, when I run the code, prediction alone takes 2 seconds.
Do you have any idea why?

I have already checked that the GPU is enabled.


import os
from data import COCODetection, MEANS, COLORS, COCO_CLASSES
from yolact import Yolact
from utils.augmentations import BaseTransform, FastBaseTransform, Resize
from utils.functions import MovingAverage, ProgressBar
from layers.box_utils import jaccard, center_size
from utils import timer
from utils.functions import SavePath
from layers.output_utils import postprocess, undo_image_transformation
import pycocotools

from data import cfg, set_cfg, set_dataset

import numpy as np
import torch
import torch.backends.cudnn as cudnn
from torch.autograd import Variable
import argparse
import time
import random
import cProfile
import pickle
import json
import os
from pathlib import Path
from collections import OrderedDict
from PIL import Image

import matplotlib.pyplot as plt
import time

set_cfg("yolact_resnet50_config")
with torch.no_grad():
    torch.cuda.set_device(1)
    cudnn.benchmark = True
    cudnn.fastest = True
    torch.set_default_tensor_type('torch.cuda.FloatTensor')
    net = Yolact()
    net.load_weights('./weights/yolact_resnet50_54_800000.pth')
    net.eval()
    net = net.cuda()
print('model loaded...')

#run your code
def execute(rgb_image):
    net.detect.cross_class_nms = True
    net.detect.use_fast_nms = True
    cfg.mask_proto_debug = False
    with torch.no_grad():
        frame = torch.Tensor(rgb_image).cuda().float()
        batch = FastBaseTransform()(frame.unsqueeze(0))
        time_start = time.clock()
        preds = net(batch)
        time_elapsed = (time.clock() - time_start)
        h, w, _ = rgb_image.shape
        t = postprocess(preds, w, h, visualize_lincomb=False, crop_masks=True, score_threshold=0)
        torch.cuda.synchronize()
        
        classes, scores, boxes, masks = [x[:MAX_MASK_SIZE].cpu().numpy() for x in t]

        print(time_elapsed)
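
A few general points may be relevant to the measurement above, none of them confirmed for this particular setup. `time.clock()` measured CPU time on Unix and was removed in Python 3.8. With `cudnn.benchmark = True`, the first call for a new input shape includes one-time autotuning. CUDA work is launched asynchronously, so the clock should be read only after a synchronize. The sketch below is a minimal timing helper under those assumptions; `benchmark` and its parameters are hypothetical names, and the workload is a stand-in for `lambda: net(batch)`.

```python
import time


def benchmark(fn, warmup=3, iters=10, sync=None):
    """Average wall-clock seconds per call of fn(), ignoring warm-up calls.

    sync: optional zero-argument callable run before each clock read,
          for example torch.cuda.synchronize when fn launches GPU work.
    """
    for _ in range(warmup):
        fn()  # absorb one-time setup costs
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()  # wait for queued work before stopping the clock
    return (time.perf_counter() - start) / iters


# Stand-in CPU workload so the sketch runs without a GPU.
avg = benchmark(lambda: sum(range(10_000)))
print(f"{avg * 1000:.3f} ms per call")
```

Timing a single first call, as in the snippet above, tends to report the warm-up cost rather than the steady-state speed.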

Issue while running eval.py scripts

I am running this on a Linux 18.04 box with python3 and the most recent versions of all the libraries. Any idea why I get this error?

$ python3 eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video=/home/vib/Desktop/AndurilSRC/LPR_DATA/lotsofcars_1.mp4:output_video-det.mp4

Config not specified. Parsed yolact_base_config from the file name.

Loading model... Done.
Traceback (most recent call last):
File "eval.py", line 935, in <module>
evaluate(net, dataset)
File "eval.py", line 722, in evaluate
savevideo(net, inp, out)
File "eval.py", line 682, in savevideo
preds = net(batch)
File "/home/vib/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/vib/Desktop/Personal/yolact/yolact.py", line 612, in forward
return self.detect(pred_outs)
File "/home/vib/Desktop/Personal/yolact/layers/functions/detection.py", line 76, in __call__
result = self.detect(batch_idx, conf_preds, decoded_boxes, mask_data, inst_data)
File "/home/vib/Desktop/Personal/yolact/layers/functions/detection.py", line 103, in detect
boxes, masks, classes, scores = self.fast_nms(boxes, masks, scores, self.nms_thresh, self.top_k)
File "/home/vib/Desktop/Personal/yolact/layers/functions/detection.py", line 148, in fast_nms
iou.triu_(diagonal=1)
RuntimeError: invalid argument 1: expected a matrix at /pytorch/aten/src/THC/generic/THCTensorMathPairwise.cu:203
FAIL

Training speed

I am not getting consistent GPU utilization, and the ETA says 18 days on one V100 GPU (p3.2xlarge) with a batch size of 12 and 8 workers. Does this make sense?

Is there any explanation of the timer column, and is there a TensorBoard equivalent for viewing performance over time?

Thank you very much!
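
As a rough sanity check on such numbers, here is a small sketch of the arithmetic. It assumes that the timer column is seconds per iteration and that the ETA is simply the remaining iterations multiplied by that time; both are guesses from the logs on this page, not confirmed behavior of train.py. The 800k iteration count is the one mentioned in another issue here, and `eta` is a hypothetical helper.

```python
from datetime import timedelta


def eta(max_iter, cur_iter, avg_iter_seconds):
    """Estimated time left, assuming every remaining iteration takes the average time."""
    remaining = max(max_iter - cur_iter, 0)
    return timedelta(seconds=remaining * avg_iter_seconds)


# About 1.2 s per iteration over 800k iterations lands near the 11-day ETAs in the logs above.
print(eta(800_000, 0, 1.2))

# Conversely, an 18-day ETA over 800k iterations implies roughly this many seconds per iteration:
print(18 * 86_400 / 800_000)
```

Under these assumptions the ETA depends only on the iteration count and the per-iteration time, so it can be checked by hand from any log line.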

evaluation model download URL

Hi dbolya,

Can you upload your model to Google Drive or another host? The URL hosted at ucdavis is not accessible.

Thanks

About training implementation details of YOLACT

Thanks for sharing your great work!
I compared YOLACT's training config with that of RetinaNet, since YOLACT is based on RetinaNet (I think),
and I have a few questions about the training config of YOLACT.
(1) The batch size on one GPU is 8, so how many GPUs did you use when training, 4 or 8? That would mean a total batch size of 32 or 64. RetinaNet's batch size is 16.
(2) The number of iterations is 800k, which is almost 10x larger than RetinaNet's. Why?
(3) The learning rate is 1e-3, which is 10 times smaller than RetinaNet's. Why?

Thanks!
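
To put the numbers in question (2) in context, iterations can be converted to epochs. The sketch below assumes a total batch size of 8 (the figure quoted above) and 118,287 training images (the size of COCO train2017 as far as I know). Reading the `54` in weight names such as `yolact_base_54_800000.pth` as an epoch count is also an assumption.

```python
def epochs(iterations, batch_size, dataset_size):
    """Number of passes over the dataset for a given iteration count."""
    return iterations * batch_size / dataset_size


COCO_TRAIN2017_IMAGES = 118_287  # assumption: image count of COCO train2017

print(round(epochs(800_000, 8, COCO_TRAIN2017_IMAGES), 1))
```

If those assumptions hold, 800k iterations at batch size 8 is about 54 epochs. That would make the schedule long mainly because the batch is small, not because the data is seen unusually often.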

KeyError while trying to retrain on Pascal

Hello

I am facing a little issue.
I am trying to retrain the model on Pascal Voc 2012 dataset.
I took the coco like annotations from this source:
https://github.com/facebookresearch/multipathnet

Then I followed the instructions on the modifications to make in config.py.

But when I call : python train.py --config=yolact_base_config

I receive the following error:

KeyError: 'Traceback (most recent call last):\n File "/home/smile/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop\n samples = collate_fn([dataset[i] for i in batch_indices])\n File "/home/smile/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>\n samples = collate_fn([dataset[i] for i in batch_indices])\n File "/hdd1/prog/yolact/data/coco.py", line 88, in __getitem__\n im, gt, masks, h, w, num_crowds = self.pull_item(index)\n File "/hdd1/prog/yolact/data/coco.py", line 145, in pull_item\n target = self.target_transform(target, width, height)\n File "/hdd1/prog/yolact/data/coco.py", line 39, in __call__\n label_idx = self.label_map[obj[\'category_id\']] - 1\nKeyError: 12\n'

The error is not clear to me.

So what I did is create a new dataset:

PASCAL_VOC_CLASSES = ("aeroplane", "bicycle", "bird", "boat", "bottle",
                      "bus", "car", "cat", "chair", "cow", "diningtable",
                      "dog", "horse", "motorbike", "person", "pottedplant",
                      "sheep", "sofa", "train", "tvmonitor")


PASCAL_VOC_LABEL_MAP = { 1:  1,  2:  2,  3:  3,  4:  4,  5:  5,  6:  6,  7:  7,  8:  8,
                   9:  9, 10: 10, 11: 11, 13: 12, 14: 13, 15: 14, 16: 15, 17: 16,
                  18: 17, 19: 18, 20: 19, 21: 20}

pascalvoc2012_dataset = dataset_base.copy({
    'name': 'PASCAL VOC 2012',
    
    'train_images':'/media/smile/45C142AD782A7053/Datasets/PASCAL_VOC/VOC2012/VOCdevkit/VOC2012/JPEGImages/',
    'train_info':'/home/smile/multipathnet/data/annotations/pascal_train2012.json',

    'valid_images':'/media/smile/45C142AD782A7053/Datasets/PASCAL_VOC/VOC2012/VOCdevkit/VOC2012/JPEGImages/',
    'valid_info':'/home/smile/multipathnet/data/annotations/pascal_val2012.json',

    'label_map': PASCAL_VOC_LABEL_MAP
})

I created a new base config that only references the dataset I created above, with the proper number of classes:

pascalvoc_base_config = Config({
    'dataset': pascalvoc2012_dataset,
    'num_classes': 21, # This should include the background class
...

All the other fields are left untouched.

Finally I adapted yolact_base_config:

#yolact_base_config = coco_base_config.copy({
yolact_base_config = pascalvoc_base_config.copy({
    'name': 'yolact_base',

    # Dataset stuff
#    'dataset': coco2017_dataset,
#    'num_classes': len(coco2017_dataset.class_names) + 1,

    'dataset': pascalvoc2012_dataset,
    'num_classes': len(pascalvoc2012_dataset.class_names) + 1,

Here too, all the other fields are left untouched.

EDIT

After applying the modifications discussed here, the dataset configuration for training on Pascal VOC is:

MEANS_PV = (103.17, 111.70, 116.69)
STD_PV = (61.11, 59.89, 61.00)

PASCAL_VOC_CLASSES = ("aeroplane", "bicycle", "bird", "boat", "bottle",
                      "bus", "car", "cat", "chair", "cow", "diningtable",
                      "dog", "horse", "motorbike", "person", "pottedplant",
                      "sheep", "sofa", "train", "tvmonitor")


PASCAL_VOC_LABEL_MAP = { 1:  1,  2:  2,  3:  3,  4:  4,  5:  5,  6:  6,  7:  7,  8:  8,
                   9:  9, 10: 10, 11: 11, 12: 12, 13: 13, 14: 14, 15: 15, 16: 16,
                  17: 17, 18: 18, 19: 19, 20: 20}

pascalvoc2012_dataset = dataset_base.copy({
    'name': 'PASCAL VOC 2012',
    
    'train_images':'/media/smile/45C142AD782A7053/Datasets/PASCAL_VOC/VOC2012/VOCdevkit/VOC2012/JPEGImages/',
    'train_info':'/home/smile/multipathnet/data/annotations/pascal_train2012.json',

    'valid_images':'/media/smile/45C142AD782A7053/Datasets/PASCAL_VOC/VOC2012/VOCdevkit/VOC2012/JPEGImages/',
    'valid_info':'/home/smile/multipathnet/data/annotations/pascal_val2012.json',

    'label_map': PASCAL_VOC_LABEL_MAP,
    'class_names': PASCAL_VOC_CLASSES,
})
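
The `KeyError: 12` is raised by the dictionary lookup `self.label_map[obj['category_id']]`: an annotation carries category id 12, but the first label map posted above has no key 12 (it jumps from 11 to 13). A minimal sketch of how to catch this before training is shown below. It assumes COCO-style `categories` and `annotations` lists; both helper names are hypothetical.

```python
def build_label_map(categories):
    """Map each category id to a contiguous 1-based label, in id order."""
    ids = sorted(cat["id"] for cat in categories)
    return {cat_id: idx + 1 for idx, cat_id in enumerate(ids)}


def missing_category_ids(annotations, label_map):
    """Category ids used by annotations but absent from the label map."""
    return sorted({ann["category_id"] for ann in annotations} - set(label_map))


# The first map posted above has keys 1..11 and 13..21 (no 12).
first_map_keys = [k for k in range(1, 22) if k != 12]
first_map = {k: i + 1 for i, k in enumerate(first_map_keys)}

anns = [{"category_id": 12}, {"category_id": 5}]
print(missing_category_ids(anns, first_map))  # [12]

# Building the map from the file's own categories avoids the gap.
cats = [{"id": i, "name": f"class{i}"} for i in range(1, 21)]
print(missing_category_ids(anns, build_label_map(cats)))  # []
```

Deriving the map from the annotation file's `categories` list keeps it in sync with whatever ids the converter produced.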

A bug during training

[ 0] 3180 || B: 3.273 | C: 6.118 | M: 5.300 | S: 1.431 | T: 16.121 || ETA: 8 days, 0:27:19 || timer: 0.833
[ 0] 3190 || B: 3.251 | C: 6.134 | M: 5.046 | S: 1.343 | T: 15.774 || ETA: 8 days, 0:20:56 || timer: 0.924
[ 0] 3200 || B: 3.220 | C: 6.074 | M: 5.023 | S: 1.346 | T: 15.663 || ETA: 8 days, 0:14:25 || timer: 0.922
[ 0] 3210 || B: 3.249 | C: 6.012 | M: 4.997 | S: 1.397 | T: 15.655 || ETA: 8 days, 0:03:42 || timer: 0.824
[ 0] 3220 || B: 3.167 | C: 5.980 | M: 4.841 | S: 1.368 | T: 15.355 || ETA: 7 days, 23:56:10 || timer: 0.831
/opt/conda/conda-bld/pytorch_1550813258230/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype *, const Dtype *, Dtype *) [with Dtype = float, Acctype = float]: block: [33,0,0], thread: [192,0,0] Assertion *input >= 0. && *input <= 1. failed.
/opt/conda/conda-bld/pytorch_1550813258230/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype *, const Dtype *, Dtype *) [with Dtype = float, Acctype = float]: block: [33,0,0], thread: [193,0,0] Assertion *input >= 0. && *input <= 1. failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1550813258230/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered

Can you help me solve it? Thanks.
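
The assertion in the log says that a value passed to binary cross-entropy was outside [0, 1]. A NaN also fails that comparison, so one common reason (a guess here, not a diagnosis) is that the loss diverged and produced NaN predictions. The pure-Python sketch below illustrates the same domain check; `binary_cross_entropy` is a hypothetical scalar stand-in, not the library kernel.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Scalar BCE for a predicted probability p and a target y in {0, 1}."""
    # NaN fails every comparison, so it is rejected here as well.
    if math.isnan(p) or not 0.0 <= p <= 1.0:
        raise ValueError(f"BCE input must lie in [0, 1], got {p}")
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite at exactly 0 or 1
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


print(binary_cross_entropy(0.5, 1.0))

for bad in (float("nan"), 1.5, -0.1):
    try:
        binary_cross_entropy(bad, 1.0)
    except ValueError as err:
        print(err)
```

In PyTorch, checking the predictions with `torch.isnan(t).any()` just before the loss call is one way to confirm whether NaNs are the trigger.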

Not able to get 30+ fps processing speed on Nvidia RTX 2080 GPU

Hello, first off, thank you for sharing this amazing work. Much appreciated.

I wanted to report that I also could not get 30+ fps on an Nvidia RTX 2080 GPU with 8 GB of memory. I am getting 8-10 fps with video. With images, I get ~16 fps (0.06 sec/image) with the ResNet-101 model, ~20 fps (0.05 sec/image) with the ResNet-50 model and 17-18 fps (0.055 sec/image) with the Darknet53 model. This is quite impressive, but it's roughly 1/2 of what is reported in the paper. For images, I used the Python timeit module to wrap the evalimage function to report my numbers. It is also odd that the difference in speed between the models is not significant (especially between ResNet-101 and ResNet-50), which suggests to me that something is reducing the processing speed by ~1/2 for all the models.

The command I am using is as below (except I change the model name as needed):

python3 eval.py --trained_model=weights/yolact_resnet50_54_800000.pth --score_threshold=0.4 --top_k=100 --images=./test_images:./test_output_images

I also tried using --benchmark but there is no change in the numbers above.

I was wondering if I could get some help to figure this out.

An issue with a custom dataset

Hi, thanks for your work. Recently I have been trying to train the net on my custom dataset, and I hit an issue that I find hard to debug by myself. Here is my problem. Thanks again for your help.

[ 2] 2930 || B: 3.808 | C: 2.416 | M: 4.821 | S: 0.049 | T: 11.094 || ETA: 4 days, 14:47:05 || timer: 0.478
[ 2] 2940 || B: 3.795 | C: 2.418 | M: 4.838 | S: 0.049 | T: 11.101 || ETA: 4 days, 14:47:10 || timer: 0.497
[ 2] 2950 || B: 3.787 | C: 2.421 | M: 4.812 | S: 0.049 | T: 11.069 || ETA: 4 days, 14:49:01 || timer: 0.474
[ 2] 2960 || B: 3.778 | C: 2.422 | M: 4.846 | S: 0.049 | T: 11.095 || ETA: 4 days, 14:49:52 || timer: 0.512
[ 2] 2970 || B: 3.748 | C: 2.419 | M: 4.846 | S: 0.048 | T: 11.061 || ETA: 4 days, 14:49:04 || timer: 0.491

Computing validation mAP (this may take a while)...

Traceback (most recent call last):
File "train.py", line 377, in <module>
train()
File "train.py", line 300, in train
compute_validation_map(yolact_net, val_dataset)
File "train.py", line 370, in compute_validation_map
eval_script.evaluate(yolact_net, dataset, train_mode=True)
File "/data/pancreas/root/yolact-master/eval.py", line 869, in evaluate
prep_metrics(ap_data, preds, img, gt, gt_masks, h, w, num_crowd, dataset.ids[image_idx], detections)
File "/data/pancreas/root/yolact-master/eval.py", line 433, in prep_metrics
ap_obj = ap_data[iou_type][iouIdx][_class]
IndexError: list index out of range

How do I use your scripts to generate my own anchor sizes and scales?

Dear Sir:
I am having trouble understanding your cluster_bbox_sizes.py, optimize_bboxes.py and bbox_recall.py. I would really like to use them to set the parameters scales, aspect_ratios and conv_sizes more sensibly.
Could you please explain a little about what these mean? Thanks a lot!

I used the default parameters, as yolact_base.cfg does, and tested the scripts on a dataset:
scales = [[24], [48], [96], [192], [384]]
aspect_ratios = [[[1, 1/sqrt(2), sqrt(2)]]] * 5
conv_sizes = [(69, 69), (35, 35), (18, 18), (9, 9), (5, 5)]
Here are the results:
from: cluster
`0.062 (18) aspect ratios:
17.71 (8)
5.23 (8)
109.76 (2)

0.146 (70) aspect ratios:
4.39 (34)
2.26 (30)
0.65 (6)

0.241 (125) aspect ratios:
1.12 (103)
0.23 (21)
0.00 (1)
`

from optimize_bbox:

`(Iteration 9) Aspect Ratios: [[[19.03, 0.55, 1.13]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]]]

scales = [[17.53], [60.94], [108.94], [204.94], [396.94]]

aspect_ratios = [[[19.03, 0.55, 1.13]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]]]
`

from bbox_recall:

`Total recall: 33.80

small recall: 0.00
medium recall: 0.00
large recall: 46.75
`

Thanks a lot! It's a bit hard for me >o<
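
For readers trying to interpret `scales` and `aspect_ratios`, here is a minimal sketch of the common SSD-style convention. A scale is the side length in pixels of a square anchor, and an aspect ratio (width / height) reshapes it while keeping the area fixed. Whether the scripts named above use exactly this convention is an assumption, and `anchor_size` is a hypothetical helper.

```python
import math


def anchor_size(scale, aspect_ratio):
    """Width and height of an anchor with area scale**2 and width/height == aspect_ratio."""
    root = math.sqrt(aspect_ratio)
    return scale * root, scale / root


# Default-style settings: one scale per feature map, three ratios each.
for scale in (24, 48, 96):
    for ratio in (1.0, 0.5, 2.0):
        w, h = anchor_size(scale, ratio)
        print(f"scale={scale:3d} ratio={ratio:.1f} -> {w:6.1f} x {h:6.1f}")
```

Under this convention, an aspect ratio such as 13.94 describes a box about fourteen times wider than it is tall.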

eval.py does not process all 5k images

When I run:

python eval.py --trained_model=weights/yolact_base_54_800000.pth --dataset=coco2017_dataset

It only evaluates 4952 images. Any ideas on why it doesn't go through all 5000 images in ./data/coco/images/ ?

The image folder has 5000 images and the annotations_val2017.json file has annotations for those images.

What do I need to change so that it evaluates the complete set of images? (5k)
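
One likely explanation, stated as an assumption: a loader that builds its image list from the annotations will skip images that have no annotations at all. As far as I know, COCO val2017 contains 48 such images, which would leave 4952. The sketch below counts both groups in a COCO-style dict; `count_annotated_images` is a hypothetical helper and the data is a toy example.

```python
def count_annotated_images(coco):
    """Return (images with at least one annotation, images with none)."""
    annotated = {ann["image_id"] for ann in coco["annotations"]}
    all_ids = {img["id"] for img in coco["images"]}
    return len(all_ids & annotated), len(all_ids - annotated)


toy = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [{"image_id": 1}, {"image_id": 1}, {"image_id": 3}],
}
print(count_annotated_images(toy))  # (2, 1)
```

Running the same count on the real annotation file (loaded with `json.load`) would show whether the 48 missing images are simply the ones without annotations.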

Problems encountered while training my own dataset

Hi,
To handle the problem of stacked instances of the same object, I trained on my dataset as instructed, but some masks do not completely cover the object; only part of it is covered. Do you know what causes this? Do you have any suggestions for changes?
Looking forward to your reply, thank you.

training my dataset with multi gpus

Hi, thanks for your good work!
I want to train on my dataset using 4 GPUs, but I find it slower than a single GPU (same batch_size). Why?

MemoryError

Memory is 12 GB, but only 8 GB was in use.

python train.py --config=yolact_base_config --batch_size=5
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
Initializing weights...
Begin training!

[ 0] 0 || B: 8.264 | C: 14.452 | M: 14.870 | S: 3.010 | T: 40.595 || ETA: 0:00:00 || timer: 12.147
[ 0] 10 || B: 9.251 | C: 9.149 | M: 7.010 | S: 2.204 | T: 27.615 || ETA: 0:57:52 || timer: 0.445
[ 0] 20 || B: 8.156 | C: 7.494 | M: 6.613 | S: 1.537 | T: 23.800 || ETA: 1:00:17 || timer: 0.441
[ 0] 30 || B: 8.053 | C: 6.515 | M: 6.317 | S: 1.206 | T: 22.091 || ETA: 1:08:55 || timer: 0.437
[ 0] 40 || B: 7.631 | C: 5.865 | M: 6.203 | S: 0.981 | T: 20.680 || ETA: 1:22:37 || timer: 0.428
[ 0] 50 || B: 7.558 | C: 5.397 | M: 6.149 | S: 0.845 | T: 19.949 || ETA: 1:20:02 || timer: 0.432
Traceback (most recent call last):
File "train.py", line 374, in <module>
train()
File "train.py", line 211, in train
for datum in data_loader:
File "/home/chase/anaconda3/envs/maskrcnn_benchmark1/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
return self._process_next_batch(batch)
File "/home/chase/anaconda3/envs/maskrcnn_benchmark1/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
MemoryError: Traceback (most recent call last):
File "/home/chase/anaconda3/envs/maskrcnn_benchmark1/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/chase/anaconda3/envs/maskrcnn_benchmark1/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/chase/yolact/data/coco.py", line 88, in __getitem__
im, gt, masks, h, w, num_crowds = self.pull_item(index)
File "/home/chase/yolact/data/coco.py", line 151, in pull_item
{'num_crowds': num_crowds, 'labels': target[:, 4]})
File "/home/chase/yolact/utils/augmentations.py", line 658, in __call__
return self.augment(img, masks, boxes, labels)
File "/home/chase/yolact/utils/augmentations.py", line 54, in __call__
img, masks, boxes, labels = t(img, masks, boxes, labels)
File "/home/chase/yolact/utils/augmentations.py", line 380, in __call__
current_masks = masks[mask, :, :].copy()
MemoryError
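
The traceback ends in a DataLoader worker while copying a stack of per-instance masks, so the memory that ran out is host RAM rather than GPU memory. A back-of-the-envelope estimate shows how quickly such a stack grows. The image size, instance count and element size below are hypothetical numbers chosen for illustration, not values taken from this dataset.

```python
def mask_stack_bytes(num_instances, height, width, bytes_per_value):
    """Size in bytes of a dense (num_instances, height, width) mask array."""
    return num_instances * height * width * bytes_per_value


# Hypothetical example: 90 instances in a 2000 x 3000 image, 4 bytes per value.
size = mask_stack_bytes(num_instances=90, height=2000, width=3000, bytes_per_value=4)
print(f"{size / 2**30:.2f} GiB per copy")
```

Each worker process holds its own copies, and augmentations may copy the stack again. Large images with many instances can therefore exhaust RAM even when average usage looks moderate. Fewer workers or smaller input images are possible mitigations under this assumption.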

Is YOLACT feasible on mobile devices?

First of all, I would like to thank you for your outstanding contribution. Secondly, I would like to ask how the algorithm you proposed performs on mobile devices with limited computing power and memory. Could you give me some suggestions? Thank you so much!

How to run eval.py without cuda?

Hello, I'm trying to run eval.py, but got an error.
The error message is:

Traceback (most recent call last):
File "eval.py", line 990, in <module>
torch.set_default_tensor_type('torch.cuda.FloatTensor')
File "/home/administrator/anaconda3/lib/python3.7/site-packages/torch/__init__.py", line 158, in set_default_tensor_type
_C._set_default_tensor_type(t)
File "/home/administrator/anaconda3/lib/python3.7/site-packages/torch/cuda/__init__.py", line 161, in _lazy_init
_check_driver()
File "/home/administrator/anaconda3/lib/python3.7/site-packages/torch/cuda/__init__.py", line 75, in _check_driver
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

I don't have a GPU in my PC. How can I run eval.py without CUDA? Thanks.
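
The traceback shows the script requesting `'torch.cuda.FloatTensor'` unconditionally. A minimal sketch of a CPU fallback is shown below. It only covers the default tensor type; other `.cuda()` calls in the script would presumably need the same treatment, which is an assumption about code not shown here. Both function names are hypothetical.

```python
def default_tensor_type(cuda_available):
    """Name of the default tensor type to request from PyTorch."""
    return "torch.cuda.FloatTensor" if cuda_available else "torch.FloatTensor"


def configure_torch():
    """Select the GPU when present, otherwise fall back to the CPU."""
    import torch  # imported lazily so the helper above works without PyTorch

    use_cuda = torch.cuda.is_available()
    torch.set_default_tensor_type(default_tensor_type(use_cuda))
    return torch.device("cuda" if use_cuda else "cpu")


print(default_tensor_type(False))
```

Weights saved on a GPU machine also need `torch.load(path, map_location="cpu")` to load on a CPU-only install. Expect inference to be far slower than the GPU figures quoted elsewhere on this page.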

a very very very strange problem on windows

I think there is something incompatible with Windows in yolact.py.

Running eval.py reports a CUDA unknown error, located at 'torch.set_default_tensor_type('torch.cuda.FloatTensor')'. It looks like CUDA fails to initialize.

I tried moving 'torch.set_default_tensor_type('torch.cuda.FloatTensor')' from the top line further down, like this:

#try1
import torch
torch.set_default_tensor_type('torch.cuda.FloatTensor')
from data import COCODetection, get_label_map, MEANS, COLORS
...

#try2
import torch
...
torch.set_default_tensor_type('torch.cuda.FloatTensor')
from yolact import Yolact
...

Then I found that putting it before 'from yolact import Yolact' works; otherwise it fails.

Now, at the beginning of yolact.py, I write the following:

import torch
torch.set_default_tensor_type('torch.cuda.FloatTensor')
from data import COCODetection, get_label_map, MEANS, COLORS
...

How do you produce the 'mask coefficients'?

Hi, thanks a lot for your fantastic work! In your paper, you produce the 'mask coefficients'
using fc layers, but in your code I found that you produce them using a conv layer. Can you tell me which kind of layer you use for producing the 'mask coefficients'? Thanks for your reply!

What inspired the prototypenet?

I know RetinaNet inspired the basic backbone, SSD inspired the loss, and Mask R-CNN inspired the branch,
but I wonder what inspired the protonet?

Fine tuning with existing model

Hi,

I tried to train a model with a custom dataset and the resnet101 backbone. I noticed that while half of the bounding boxes looked accurate, the masks were completely off. I drew the annotations and verified that they are correct.

It could be due to the size of the dataset: 1357 images and 21 classes. I would like to use yolact_im700_54_80000.pth and fine-tune it with my custom classes to see if this improves my results. What would be the steps to do this?

Compute Validation Loss

Hi, is there a way to get validation loss during training? I want to monitor it for overfitting cases.

I noticed you had it before (which is giving me errors), but the overhaul has removed it.

Thanks.

preserve_aspect_ratio question

I am training on Cityscapes, so I want to preserve the aspect ratio (1024, 2048).
However, after turning on preserve_aspect_ratio, the loss keeps decreasing but the visualized bounding box positions are always wrong.

I also found that the line below uses max_size for both width and height.
I think it should be b_w, b_h = (int(cfg.max_size / r_w * w), int(cfg.min_size / r_h * h)),
or directly b_w, b_h = w, h.
I don't understand the comment '# A hack to scale the bboxes to the right size'.
Is this a bug or some trick?

b_w, b_h = (cfg.max_size / r_w * w, cfg.max_size / r_h * h)

Thanks

Training time is long?

Hi, dbolya.

Thanks for your work. I tried to reproduce the performance with the ResNet50 pre-trained model using the command 'python train.py --config=yolact_resnet50_config'. While training, I found that it needs about 30 days to finish, which is too long. I then set batch_size = 32 because I have 8 GPUs, but the estimate stayed the same: the total training time was still about 30 days.

Did I do anything wrong, or is the training time really this long? How can I use multiple GPUs to accelerate training?

Thanks!
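
A general observation that may explain the unchanged estimate: if the ETA is roughly iterations times seconds per iteration, a larger batch shortens training only when the iteration count is reduced to match. The sketch below applies the widely used linear scaling heuristic, using the base values quoted on this page (batch 8, 800k iterations, learning rate 1e-3). Whether train.py applies any such scaling itself is not something this page confirms, and `scaled_schedule` is a hypothetical helper.

```python
def scaled_schedule(base_iters, base_batch, new_batch, base_lr):
    """Keep the number of epochs constant when the batch size changes.

    Returns (iterations, learning rate) under the linear scaling heuristic:
    iterations shrink and the learning rate grows by the batch-size factor.
    """
    factor = new_batch / base_batch
    return round(base_iters / factor), base_lr * factor


iters, lr = scaled_schedule(base_iters=800_000, base_batch=8, new_batch=32, base_lr=1e-3)
print(iters, lr)
```

With four times the batch, the same number of epochs takes a quarter of the iterations. The time per iteration still has to stay close to the single-GPU figure for that to translate into a shorter run.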

Custom dataset runtime error

Hello

I am trying to retrain YOLACT on Pascal Part, a variation of Pascal VOC where each class has many sub-classes.
To simplify everything, I made every sub-class a class in addition to the 20 original ones, which gives me a set of 316 classes.
I generated three JSON files for each case.

When I start training I encounter the following error:
RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity
It happens here:
losses = criterion(out, wrapper, wrapper.make_mask())
in train.py around line 262 (I added some prints to my file, so my line number is different).

Here:
eriklindernoren/PyTorch-YOLOv3#110

I read it might be a path issue; however, I rechecked and the image paths are correct.
Also, I am able to train on Pascal VOC using the same image path without issues.

I tried to investigate the forward method of the loss function, looking for an empty tensor, but I did not find any.
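
The message means a `max` reduction ran on a tensor with zero elements. One thing worth checking, as a guess rather than a diagnosis, is whether some images end up with no usable ground-truth boxes. Part-level annotations can be very small, so degenerate boxes are a candidate. The sketch below scans COCO-style annotations for boxes with near-zero width or height; `find_degenerate_boxes` and the `min_size` threshold are hypothetical.

```python
def find_degenerate_boxes(annotations, min_size=1.0):
    """Ids of annotations whose COCO-style bbox [x, y, w, h] is narrower or shorter than min_size."""
    return [
        ann["id"]
        for ann in annotations
        if ann["bbox"][2] < min_size or ann["bbox"][3] < min_size
    ]


toy_anns = [
    {"id": 1, "bbox": [10, 10, 50, 40]},
    {"id": 2, "bbox": [30, 30, 0, 12]},    # zero width
    {"id": 3, "bbox": [5, 5, 20, 0.5]},    # sub-pixel height
]
print(find_degenerate_boxes(toy_anns))  # [2, 3]
```

If the real annotation files contain such entries, filtering them out before training is one way to test whether they trigger the error.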
