open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
Home Page: https://mmdetection.readthedocs.io
License: Apache License 2.0
I need your help : )
ImportError: /home/wangshuainan/mmdetection/mmdet/ops/roi_align/roi_align_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN2at5ErrorC1ENS_14SourceLocationESs
Hello, thanks for your excellent work.
I first tested the pre-trained model "faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth"
on a server with a single Titan Xp,
and the mAP was similar to the reported result.
Then I used the following command to train: python ./tools/train.py ./configs/faster_rcnn_r50_fpn_1x.py --gpus 1 --work_dir ./experiments/faster_rcnn_r50_fpn_1x
On a single Titan Xp this took about 3 days...
When I tested the resulting model with the same script, the mAP was noticeably lower.
I guess this might be because I trained with only one GPU. But the total number of mini-batches is fixed, and there is no BN option for ResNet-50 on a single GPU anyway... I cannot figure out why the mAP drops by nearly 10 points.
Could you please give some advice?
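One thing I am wondering about (just my own guess, not an official answer): the default learning rate in the config is tuned for 8 GPUs with 2 images each, so on a single GPU the effective batch size is 8x smaller. A minimal sketch of the linear-scaling adjustment, assuming the stock optimizer settings in configs/faster_rcnn_r50_fpn_1x.py:

# Hypothetical single-GPU tweak following the linear scaling rule
# (lr proportional to total batch size: 16 imgs -> 0.02, 2 imgs -> 0.0025).
optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001)

Would that be the right direction, or is something else going on?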
I used the command
python tools/test.py configs/mask_rcnn_r50_fpn_1x.py weights/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth --gpus 1 --out result.pkl
to test the pretrained model, but I met an error. The output is:
loading annotations into memory...
Done (t=0.61s)
creating index...
index created!
Traceback (most recent call last):
File "tools/test.py", line 114, in
main()
File "tools/test.py", line 84, in main
shuffle=False)
File "/home/jwwangchn/anaconda2/lib/python2.7/site-packages/mmdet/datasets/loader/build_loader.py", line 28, in build_dataloader
sampler = GroupSampler(dataset, imgs_per_gpu)
File "/home/jwwangchn/anaconda2/lib/python2.7/site-packages/mmdet/datasets/loader/sampler.py", line 14, in init
assert hasattr(dataset, 'flag')
AssertionError
I have installed mmcv 0.2.0
from source. I want to know how to solve this problem.
I saw your software environment is:
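From reading the sampler code, the assertion seems to refer to an aspect-ratio group flag that every dataset class is expected to set. A rough sketch of what GroupSampler appears to assume (my reading of the custom dataset behaviour, not necessarily the exact repo code):

import numpy as np

def _set_group_flag(self):
    # Group images by aspect ratio: flag 1 if width / height > 1, else 0.
    # GroupSampler asserts that the dataset exposes this `flag` array.
    self.flag = np.zeros(len(self), dtype=np.uint8)
    for i in range(len(self)):
        img_info = self.img_infos[i]
        if img_info['width'] / img_info['height'] > 1:
            self.flag[i] = 1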
Hi, does the repository support Fast R-CNN?
When I run the script "python tools/train.py configs/faster_rcnn_r50_fpn_1x.py --gpus 1 --work_dir logs --validate", I meet the following problem:
2018-10-13 23:54:59,086 - INFO - workflow: [('train', 1)], max: 12 epochs
Segmentation fault
The error message is as follows:
In file included from <command-line>:0:0:
/usr/include/stdc-predef.h:59:1: fatal error: cuda_runtime.h: No such file or directory
#endif
^
compilation terminated.
error: command 'nvcc' failed with exit status 1
make: *** [all] Error 1
This error occurred on an Ubuntu 14.04 machine with CUDA 8.0; the environment variables are set properly.
On another Ubuntu 16.04 machine with CUDA 9.0, the compilation goes smoothly.
ImportError: /home/l547/anaconda3/lib/python3.7/site-packages/mmdet-0.5.1+810b711-py3.7.egg/mmdet/ops/nms/gpu_nms.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __cudaPopCallConfiguration
I used the Python script convert_cityscapes_to_coco.py
and successfully converted the Cityscapes dataset to COCO format. But when I modified the config file faster_rcnn_r50_fpn_1x.py
and used the command python tools/train.py configs/faster_rcnn_r50_fpn_1x.py --gpus 2 --work_dir ./out --validate
to train, I got the error:
2018-10-16 11:05:01,156 - INFO - Distributed training: False
2018-10-16 11:05:01,600 - INFO - load model from: modelzoo://resnet50
2018-10-16 11:05:01,825 - WARNING - unexpected key in source state_dict: fc.weight, fc.bias
missing keys in source state_dict: layer3.4.bn1.num_batches_tracked, layer1.0.downsample.1.num_batches_tracked, layer3.5.bn2.num_batches_tracked, layer3.0.bn2.num_batches_tracked, layer4.1.bn3.num_batches_tracked, layer2.1.bn1.num_batches_tracked, layer3.4.bn3.num_batches_tracked, layer1.1.bn2.num_batches_tracked, layer3.3.bn3.num_batches_tracked, layer2.2.bn3.num_batches_tracked, layer3.1.bn1.num_batches_tracked, layer1.2.bn1.num_batches_tracked, layer4.0.bn1.num_batches_tracked, layer2.3.bn2.num_batches_tracked, layer3.1.bn3.num_batches_tracked, layer4.2.bn1.num_batches_tracked, layer3.4.bn2.num_batches_tracked, layer4.0.bn2.num_batches_tracked, layer4.2.bn3.num_batches_tracked, layer2.1.bn3.num_batches_tracked, layer2.0.bn3.num_batches_tracked, layer1.0.bn1.num_batches_tracked, layer1.0.bn3.num_batches_tracked, layer4.1.bn2.num_batches_tracked, layer3.0.bn1.num_batches_tracked, layer1.0.bn2.num_batches_tracked, layer3.2.bn1.num_batches_tracked, layer2.1.bn2.num_batches_tracked, layer4.1.bn1.num_batches_tracked, layer3.0.downsample.1.num_batches_tracked, layer1.2.bn3.num_batches_tracked, layer2.0.downsample.1.num_batches_tracked, layer2.3.bn3.num_batches_tracked, layer3.3.bn2.num_batches_tracked, layer1.1.bn1.num_batches_tracked, layer4.2.bn2.num_batches_tracked, layer3.3.bn1.num_batches_tracked, layer1.1.bn3.num_batches_tracked, layer2.0.bn1.num_batches_tracked, layer3.0.bn3.num_batches_tracked, layer2.3.bn1.num_batches_tracked, layer3.1.bn2.num_batches_tracked, layer4.0.downsample.1.num_batches_tracked, layer2.0.bn2.num_batches_tracked, layer2.2.bn2.num_batches_tracked, layer3.5.bn1.num_batches_tracked, layer3.2.bn3.num_batches_tracked, layer2.2.bn1.num_batches_tracked, layer3.2.bn2.num_batches_tracked, bn1.num_batches_tracked, layer3.5.bn3.num_batches_tracked, layer4.0.bn3.num_batches_tracked, layer1.2.bn2.num_batches_tracked
loading annotations into memory...
Done (t=4.59s)
creating index...
index created!
2018-10-16 11:05:09,360 - INFO - Start running, host: chenkai@Autodrive, work_dir: /home/chenkai/Documents/mmdetection/out
2018-10-16 11:05:09,360 - INFO - workflow: [('train', 1)], max: 12 epochs
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f2cce44c1d0>>
Traceback (most recent call last):
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 399, in del
self._shutdown_workers()
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
self.worker_result_queue.get()
File "/usr/lib/python3.6/multiprocessing/queues.py", line 337, in get
return _ForkingPickler.loads(res)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
fd = df.detach()
File "/usr/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/usr/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 493, in Client
answer_challenge(c, authkey)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 737, in answer_challenge
response = connection.recv_bytes(256) # reject large message
File "/usr/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
Traceback (most recent call last):
File "tools/train.py", line 82, in
main()
File "tools/train.py", line 78, in main
logger=logger)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmdet-0.5.0+f29f020-py3.6.egg/mmdet/apis/train.py", line 59, in train_detector
_non_dist_train(model, dataset, cfg, validate=validate)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmdet-0.5.0+f29f020-py3.6.egg/mmdet/apis/train.py", line 117, in _non_dist_train
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmcv-0.2.0-py3.6.egg/mmcv/runner/runner.py", line 349, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmcv-0.2.0-py3.6.egg/mmcv/runner/runner.py", line 255, in train
self.model, data_batch, train_mode=True, **kwargs)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmdet-0.5.0+f29f020-py3.6.egg/mmdet/apis/train.py", line 37, in batch_processor
losses = model(**data)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 123, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 133, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 77, in parallel_apply
raise output
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 53, in _worker
output = module(*input, **kwargs)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmdet-0.5.0+f29f020-py3.6.egg/mmdet/models/detectors/base.py", line 79, in forward
return self.forward_train(img, img_meta, **kwargs)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmdet-0.5.0+f29f020-py3.6.egg/mmdet/models/detectors/two_stage.py", line 111, in forward_train
self.train_cfg.rcnn)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmdet-0.5.0+f29f020-py3.6.egg/mmdet/models/bbox_heads/bbox_head.py", line 73, in get_bbox_target
target_stds=self.target_stds)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmdet-0.5.0+f29f020-py3.6.egg/mmdet/core/bbox/bbox_target.py", line 25, in bbox_target
target_stds=target_stds)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmdet-0.5.0+f29f020-py3.6.egg/mmdet/core/utils/misc.py", line 24, in multi_apply
return tuple(map(list, zip(*map_results)))
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmdet-0.5.0+f29f020-py3.6.egg/mmdet/core/bbox/bbox_target.py", line 62, in proposal_target_single
labels, reg_num_classes)
File "/home/chenkai/.virtualenvs/mmdetection_py36/lib/python3.6/site-packages/mmdet-0.5.0+f29f020-py3.6.egg/mmdet/core/bbox/bbox_target.py", line 75, in expand_target
bbox_targets_expand[i, start:end] = bbox_targets[i, :]
RuntimeError: The expanded size of the tensor (0) must match the existing size (4) at non-singleton dimension 0
Could you help me to solve this problem? @hellock
Detectron.pytorch allows us to fine-tune from Detectron weights. Is this possible for this repo, or do you provide pretrained weights trained in Caffe style? Somehow the performance of Caffe-style models is better.
When I change the type of bbox_head in the config from 'SharedFCRoIHead' to 'ConvFCRoIHead', the above error occurs.
class ConvFCRoIHead(BBoxHead):
    """More general bbox head, with shared conv and fc layers and two optional
    separated branches.

                                /-> cls convs -> cls fcs -> cls
    shared convs -> shared fcs
                                \-> reg convs -> reg fcs -> reg
    """

    def __init__(self,
                 num_shared_convs=0,
                 num_shared_fcs=0,
                 num_cls_convs=0,
                 num_cls_fcs=0,
                 num_reg_convs=0,
                 num_reg_fcs=0,
                 conv_out_channels=256,
                 fc_out_channels=1024,
                 normalize=None,  # add this line
                 with_bias=False,  # add this line
                 *args,
                 **kwargs):
        super(ConvFCRoIHead, self).__init__(*args, **kwargs)
        assert (num_shared_convs + num_shared_fcs + num_cls_convs +
                num_cls_fcs + num_reg_convs + num_reg_fcs > 0)
        if num_cls_convs > 0 or num_reg_convs > 0:
            assert num_shared_fcs == 0
        if not self.with_cls:
            assert num_cls_convs == 0 and num_cls_fcs == 0
        if not self.with_reg:
            assert num_reg_convs == 0 and num_reg_fcs == 0
        self.num_shared_convs = num_shared_convs
        self.num_shared_fcs = num_shared_fcs
        self.num_cls_convs = num_cls_convs
        self.num_cls_fcs = num_cls_fcs
        self.num_reg_convs = num_reg_convs
        self.num_reg_fcs = num_reg_fcs
        self.conv_out_channels = conv_out_channels
        self.fc_out_channels = fc_out_channels
        self.normalize = normalize  # add this line
        self.with_bias = with_bias  # add this line
bbox_head=dict(
    type='ConvFCRoIHead',
    num_shared_convs=2,
    num_shared_fcs=0,
    num_cls_convs=1,
    num_cls_fcs=2,
    num_reg_convs=1,
    num_reg_fcs=2,
    conv_out_channels=256,
    fc_out_channels=1024,
    normalize={'type': 'BN'},
    # BBoxHead
    in_channels=256,
    roi_feat_size=7,
    num_classes=11,
    target_means=[0., 0., 0., 0.],
    target_stds=[0.1, 0.1, 0.2, 0.2],
    reg_class_agnostic=False))
Good design choice for the iteration pipeline.
As for TwoStageDetector, especially with the pyramid structure in FPN, the many per-image argsort calls in rpn_head become a bottleneck when you attempt to enlarge the batch size. This hurts training speed and GPU utilization dramatically as the batch size grows.
SNIPER uses half-precision training, and its smaller input image size allows a larger batch size even on a single GPU; it adopts a two-stage training process instead of end-to-end training to cope with this bottleneck.
I have tried to speed up rpn_head with multi-processing or multi-threading in Python directly, but because of the fork overhead and the GIL, neither seems to be an option.
Do you have any idea on how to speed up rpn_head directly?
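What I have in mind is something like a single batched top-k instead of the per-image argsort loop; a rough sketch with assumed tensor shapes (not the actual rpn_head code):

import torch

def batched_topk_proposals(scores, k):
    # scores: (num_imgs, num_anchors) objectness for one FPN level.
    # One batched topk call replaces a Python loop of per-image argsorts.
    k = min(k, scores.size(1))
    topk_scores, topk_inds = scores.topk(k, dim=1)  # each (num_imgs, k)
    return topk_scores, topk_inds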
I'm training Mask R-CNN and got this error during the 9th epoch. It seems like a dataloader deadlock.
I also encountered a dataloader deadlock with the Detectron.pytorch code before, and the solution was to train with 1 img/gpu. (Check this issue)
Any idea what might cause this problem? I'm not sure whether it is a PyTorch dataloader problem or a problem in the dataset's __getitem__() function.
Thanks in advance.
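For now I am considering two workarounds (my own assumptions, not verified against this repo): lowering workers_per_gpu in the config, or switching the worker sharing strategy before training starts:

import torch.multiprocessing as mp

# The default file-descriptor sharing strategy can exhaust the fd limit when
# many workers pass tensors around; file_system sharing avoids that at the
# cost of /dev/shm usage.
mp.set_sharing_strategy('file_system')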
I compiled the extensions and tested images using this example, but when I import, I get this error:
Traceback (most recent call last):
File "tools/test.py", line 9, in
from mmdet.core import results2json, coco_eval
File "/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/mmdet/core/init.py", line 6, in
from .post_processing import * # noqa: F401, F403
File "/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/mmdet/core/post_processing/init.py", line 1, in
from .bbox_nms import multiclass_nms
File "/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/mmdet/core/post_processing/bbox_nms.py", line 3, in
from mmdet.ops import nms
File "/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/mmdet/ops/init.py", line 2, in
from .roi_align import RoIAlign, roi_align
File "/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/mmdet/ops/roi_align/init.py", line 1, in
from .functions.roi_align import roi_align
File "/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/mmdet/ops/roi_align/functions/roi_align.py", line 3, in
from .. import roi_align_cuda
ImportError: cannot import name 'roi_align_cuda'
I checked the folder and there is no roi_align_cuda. How can I solve this? Messages from my compilation:
Building roi align op...
running build_ext
building 'roi_align_cuda' extension
creating build
creating build/temp.linux-x86_64-3.5
creating build/temp.linux-x86_64-3.5/src
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/torch/lib/include -I/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/torch/lib/include/TH -I/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/usr/include/python3.5m -I/home/yu/.virtualenvs/Pytorch/include/python3.5m -c src/roi_align_cuda.cpp -o build/temp.linux-x86_64-3.5/src/roi_align_cuda.o -DTORCH_EXTENSION_NAME=roi_align_cuda -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
src/roi_align_cuda.cpp: In function ‘int roi_align_forward_cuda(at::Tensor, at::Tensor, int, int, float, int, at::Tensor)’:
src/roi_align_cuda.cpp:20:80: error: ‘AT_CHECK’ was not declared in this scope
#define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, " must be a CUDAtensor ")
^
src/roi_align_cuda.cpp:24:3: note: in expansion of macro ‘CHECK_CUDA’
CHECK_CUDA(x);
^
src/roi_align_cuda.cpp:31:3: note: in expansion of macro ‘CHECK_INPUT’
CHECK_INPUT(features);
^
src/roi_align_cuda.cpp: In function ‘int roi_align_backward_cuda(at::Tensor, at::Tensor, int, int, float, int, at::Tensor)’:
src/roi_align_cuda.cpp:20:80: error: ‘AT_CHECK’ was not declared in this scope
#define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, " must be a CUDAtensor ")
^
src/roi_align_cuda.cpp:24:3: note: in expansion of macro ‘CHECK_CUDA’
CHECK_CUDA(x);
^
src/roi_align_cuda.cpp:59:3: note: in expansion of macro ‘CHECK_INPUT’
CHECK_INPUT(top_grad);
^
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
Building roi pool op...
running build_ext
building 'roi_pool_cuda' extension
creating build
creating build/temp.linux-x86_64-3.5
creating build/temp.linux-x86_64-3.5/src
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/torch/lib/include -I/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/torch/lib/include/TH -I/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/usr/include/python3.5m -I/home/yu/.virtualenvs/Pytorch/include/python3.5m -c src/roi_pool_cuda.cpp -o build/temp.linux-x86_64-3.5/src/roi_pool_cuda.o -DTORCH_EXTENSION_NAME=roi_pool_cuda -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
src/roi_pool_cuda.cpp: In function ‘int roi_pooling_forward_cuda(at::Tensor, at::Tensor, int, int, float, at::Tensor, at::Tensor)’:
src/roi_pool_cuda.cpp:19:80: error: ‘AT_CHECK’ was not declared in this scope
#define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, " must be a CUDAtensor ")
^
src/roi_pool_cuda.cpp:23:3: note: in expansion of macro ‘CHECK_CUDA’
CHECK_CUDA(x);
^
src/roi_pool_cuda.cpp:30:3: note: in expansion of macro ‘CHECK_INPUT’
CHECK_INPUT(features);
^
src/roi_pool_cuda.cpp: In function ‘int roi_pooling_backward_cuda(at::Tensor, at::Tensor, at::Tensor, float, at::Tensor)’:
src/roi_pool_cuda.cpp:19:80: error: ‘AT_CHECK’ was not declared in this scope
#define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, " must be a CUDAtensor ")
^
src/roi_pool_cuda.cpp:23:3: note: in expansion of macro ‘CHECK_CUDA’
CHECK_CUDA(x);
^
src/roi_pool_cuda.cpp:57:3: note: in expansion of macro ‘CHECK_INPUT’
CHECK_INPUT(top_grad);
^
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
Building nms op...
rm *.so
echo "Compiling nms kernels..."
Compiling nms kernels...
python setup.py build_ext --inplace
running build_ext
building 'cpu_nms' extension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/cuda/include -I/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include -I/usr/include/python3.5m -I/home/yu/.virtualenvs/Pytorch/include/python3.5m -c cpu_nms.cpp -o build/temp.linux-x86_64-3.5/cpu_nms.o -Wno-unused-function -Wno-write-strings
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/ndarraytypes.h:1821:0,
from /home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/ndarrayobject.h:18,
from /home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from cpu_nms.cpp:621:
/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by "
^
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.5/cpu_nms.o -L/usr/local/cuda/lib64 -lcudart -o /home/yu/mmdetection/mmdet/ops/nms/cpu_nms.cpython-35m-x86_64-linux-gnu.so
building 'gpu_nms' extension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/cuda/include -I/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include -I/usr/include/python3.5m -I/home/yu/.virtualenvs/Pytorch/include/python3.5m -c gpu_nms.cpp -o build/temp.linux-x86_64-3.5/gpu_nms.o -Wno-unused-function -Wno-write-strings
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/ndarraytypes.h:1821:0,
from /home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/ndarrayobject.h:18,
from /home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from gpu_nms.cpp:623:
/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by "
^
/usr/local/cuda/bin/nvcc -I/usr/local/cuda/include -I/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include -I/usr/include/python3.5m -I/home/yu/.virtualenvs/Pytorch/include/python3.5m -c nms_kernel.cu -o build/temp.linux-x86_64-3.5/nms_kernel.o -arch=sm_52 --ptxas-options=-v -c --compiler-options -fPIC
ptxas info : 0 bytes gmem
ptxas info : Compiling entry function '_Z10nms_kernelifPKfPy' for 'sm_52'
ptxas info : Function properties for _Z10nms_kernelifPKfPy
128 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 38 registers, 20480 bytes smem, 344 bytes cmem[0], 12 bytes cmem[2]
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.5/gpu_nms.o build/temp.linux-x86_64-3.5/nms_kernel.o -L/usr/local/cuda/lib64 -lcudart -o /home/yu/mmdetection/mmdet/ops/nms/gpu_nms.cpython-35m-x86_64-linux-gnu.so
building 'cpu_soft_nms' extension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/cuda/include -I/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include -I/usr/include/python3.5m -I/home/yu/.virtualenvs/Pytorch/include/python3.5m -c cpu_soft_nms.cpp -o build/temp.linux-x86_64-3.5/cpu_soft_nms.o -Wno-unused-function -Wno-write-strings
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/ndarraytypes.h:1821:0,
from /home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/ndarrayobject.h:18,
from /home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from cpu_soft_nms.cpp:621:
/home/yu/.virtualenvs/Pytorch/lib/python3.5/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by "
^
cpu_soft_nms.cpp: In function ‘PyObject __pyx_pf_12cpu_soft_nms_cpu_soft_nms(PyObject*, PyArrayObject*, float, float, float, unsigned int)’:
cpu_soft_nms.cpp:2450:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
__pyx_t_11 = ((__pyx_v_pos < __pyx_v_N) != 0);
^
cpu_soft_nms.cpp:2961:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
__pyx_t_11 = ((__pyx_v_pos < __pyx_v_N) != 0);
^
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.5/cpu_soft_nms.o -L/usr/local/cuda/lib64 -lcudart -o /home/yu/mmdetection/mmdet/ops/nms/cpu_soft_nms.cpython-35m-x86_64-linux-gnu.so
"\Python\Python36\lib\site-packages\mmdet\datasets\loader\build_loader.py", line 10, in
import resource
ModuleNotFoundError: No module named 'resource'
Q: I cannot find the module resource anywhere, and I am sure it is not a third-party Python package. Could anyone tell me where this module comes from? Has it disappeared, or is my installation incomplete?
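My current workaround idea (my own patch sketch, not an official fix) is to guard the import in build_loader.py, since resource is a POSIX-only standard-library module and is missing on Windows:

try:
    import resource  # POSIX-only; used to raise the open-file limit
except ImportError:  # e.g. on Windows
    resource = None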
CornerNet is a new model for object detection, and its code has already been open-sourced on GitHub. Would it be easy to add this model to mmdetection? @hellock @OceanPang
As PyTorch has already announced PyTorch 1.0, it would be great if the API could run on the newest PyTorch version.
Are there any plans to support the PASCAL VOC dataset?
Thanks.
When I test my trained model using the following command, something goes wrong:
python tools/test.py configs/mask_rcnn_r50_fpn_1x.py ./work_dirs/mask_rcnn_r50_fpn_1x/latest.pth --gpus 2 --eval proposal_fast --out results.pkl
loading annotations into memory...
Done (t=0.27s)
creating index...
index created!
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 5000/5000, 11.5 task/s, elapsed: 436s, ETA: 0s
writing results to results.pkl
Starting evaluate proposal_fast
Traceback (most recent call last):
File "tools/test.py", line 114, in <module>
main()
File "tools/test.py", line 110, in main
coco_eval(result_file, eval_types, dataset.coco)
File "/home/jiangboyuan/mmdetection/mmdet/core/evaluation/coco_utils.py", line 20, in coco_eval
ar = fast_eval_recall(result_file, coco, np.array(max_dets))
File "/home/jiangboyuan/mmdetection/mmdet/core/evaluation/coco_utils.py", line 73, in fast_eval_recall
gt_bboxes, results, max_dets, iou_thrs, print_summary=False)
File "/home/jiangboyuan/mmdetection/mmdet/core/evaluation/recall.py", line 86, in eval_recalls
if proposals[i].ndim == 2 and proposals[i].shape[1] == 5:
AttributeError: 'tuple' object has no attribute 'ndim'
Any suggestions on how to resolve this?
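One thing I plan to try (assuming Mask R-CNN results are saved as per-image (bbox_results, segm_results) tuples, which the proposal_fast evaluation cannot digest directly): extract the bbox part before running the evaluation:

import mmcv

results = mmcv.load('results.pkl')
# Keep only the bbox part of each per-image (bbox, segm) tuple.
bbox_results = [r[0] if isinstance(r, tuple) else r for r in results]
mmcv.dump(bbox_results, 'results_bbox.pkl')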
Hi,
can ./tools/test.py evaluate segmentation results?
Or how can I visualize the segmentation results?
Thanks!
Traceback (most recent call last):
File "tools/test.py", line 121, in
main(arguments)
File "tools/test.py", line 86, in main
outputs = single_test(model, data_loader, args.show)
File "tools/test.py", line 20, in single_test
result = model(return_loss=False, rescale=not show, **data)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
return self.module(*inputs[0], **kwargs[0])
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/mmdet/models/detectors/base.py", line 81, in forward
return self.forward_test(img, img_meta, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/mmdet/models/detectors/base.py", line 73, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/usr/local/lib/python3.5/dist-packages/mmdet/models/detectors/two_stage.py", line 149, in simple_test
self.test_cfg.rpn) if proposals is None else proposals
File "/usr/local/lib/python3.5/dist-packages/mmdet/models/detectors/test_mixins.py", line 10, in simple_test_rpn
proposal_list = self.rpn_head.get_proposals(*proposal_inputs)
File "/usr/local/lib/python3.5/dist-packages/mmdet/models/rpn_heads/rpn_head.py", line 197, in get_proposals
img_meta[img_id]['img_shape'], cfg)
File "/usr/local/lib/python3.5/dist-packages/mmdet/models/rpn_heads/rpn_head.py", line 229, in _get_proposals_single
self.target_stds, img_shape)
File "/usr/local/lib/python3.5/dist-packages/mmdet/core/bbox/transforms.py", line 54, in delta2bbox
gw = pw * dw.exp()
RuntimeError: arguments are located on different GPUs at /pytorch/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu:314
def parse_args(args):
    parser = argparse.ArgumentParser(description='MMDet test detector')
    parser.add_argument('config', help='test config file path')
    parser.add_argument('checkpoint', help='checkpoint file')
    parser.add_argument(
        '--gpus', default=1, type=int, help='GPU number used for testing')
    parser.add_argument(
        '--proc_per_gpu',
        default=1,
        type=int,
        help='Number of processes per GPU')
    parser.add_argument('--out', help='output result file')
    parser.add_argument(
        '--eval',
        type=str,
        nargs='+',
        choices=['proposal', 'proposal_fast', 'bbox', 'segm', 'keypoints'],
        help='eval types')
    parser.add_argument('--show', action='store_true', help='show results')
    args = parser.parse_args(args)
    return args


def main(args):
    args = parse_args(args)
    if args.out is not None and not args.out.endswith(('.pkl', '.pickle')):
        raise ValueError('The output file must be a pkl file.')
    cfg = mmcv.Config.fromfile(args.config)
    cfg.model.pretrained = None
    cfg.data.test.test_mode = True
    dataset = obj_from_dict(cfg.data.test, datasets, dict(test_mode=True))
    if args.gpus == 1:
        model = build_detector(
            cfg.model, train_cfg=None, test_cfg=cfg.test_cfg)
        load_checkpoint(model, args.checkpoint)
        model = MMDataParallel(model, device_ids=[1])
        print('using cuda 1')
        data_loader = build_dataloader(
            dataset,
            imgs_per_gpu=1,
            workers_per_gpu=cfg.data.workers_per_gpu,
            num_gpus=1,
            dist=False,
            shuffle=False)
        outputs = single_test(model, data_loader, args.show)
    else:
        model_args = cfg.model.copy()
        model_args.update(train_cfg=None, test_cfg=cfg.test_cfg)
        model_type = getattr(detectors, model_args.pop('type'))
        outputs = parallel_test(
            model_type,
            model_args,
            args.checkpoint,
            dataset,
            _data_func,
            range(args.gpus),
            workers_per_gpu=args.proc_per_gpu)
    if args.out:
        print('writing results to {}'.format(args.out))
        mmcv.dump(outputs, args.out)
        eval_types = args.eval
        if eval_types:
            print('Starting evaluate {}'.format(' and '.join(eval_types)))
            if eval_types == ['proposal_fast']:
                result_file = args.out
            else:
                result_file = args.out + '.json'
                results2json(dataset, outputs, result_file)
            coco_eval(result_file, eval_types, dataset.coco)


if __name__ == '__main__':
    arguments = [
        'configs/faster_rcnn_r50_fpn_1x.py',
        'data/faster_rcnn_r50_fpn_1x/epoch_12.pth',
        '--gpus=1',
        '--out=test.pkl'
    ]
    main(arguments)
def infer():
    import mmcv
    from mmcv.runner import load_checkpoint
    from mmdet.models import build_detector
    from mmdet.apis import inference_detector, show_result
    from mmcv.parallel import scatter, collate, MMDataParallel

    cfg = mmcv.Config.fromfile('configs/faster_rcnn_r50_fpn_1x.py')
    cfg.model.pretrained = None

    # construct the model and load checkpoint
    model = build_detector(cfg.model, test_cfg=cfg.test_cfg)
    print(model)
    checkpoint_path = 'data/faster_rcnn_r50_fpn_1x/epoch_12.pth'
    _ = load_checkpoint(model, checkpoint_path, map_location='cuda:1')
    model = MMDataParallel(model, device_ids=[1])

    # test a single image
    img_path = '/workspace/nas/test/'
    img = mmcv.imread(img_path + '0.jpg')
    result = inference_detector(model, img, cfg, device='cuda:1')
    show_result(img, result)

    # test a list of images
    # img_path = '/workspace/nas/test/'
    # imgs = ['0.jpg', '1.jpg']
    # imgs = [img_path + img for img in imgs]
    # for i, result in enumerate(inference_detector(model, imgs, cfg, device='cuda:2')):
    #     print(i, imgs[i])
    #     show_result(imgs[i], result)
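One workaround I am considering (not a confirmed fix): since the error suggests the model sits on GPU 1 while the scattered inputs stay on GPU 0, restrict visibility to the target card and keep device_ids=[0] so everything lands on one device:

# Launch with: CUDA_VISIBLE_DEVICES=1 python tools/test.py ...
# Inside the script, device 0 then maps to the physical GPU 1.
model = MMDataParallel(model, device_ids=[0])
result = inference_detector(model, img, cfg, device='cuda:0')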
Do you plan to support CPU prediction? Thank you very much!
For example, I have multiple annotation files (in JSON format) for different datasets. I want to train/evaluate simultaneously on those datasets by providing the paths of the annotation files, like Detectron.pytorch does.
What is the easiest way to achieve this with mmdetection?
My idea is to make use of torch.utils.data.ConcatDataset
and modify ann_file
to support multiple annotation files in the config, as sketched below.
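A minimal sketch of that idea, assuming each dataset is a CocoDataset-like object that already exposes the aspect-ratio flag array the sampler needs (the wrapper class is hypothetical, not part of the repo):

import numpy as np
from torch.utils.data import ConcatDataset

class ConcatAnnDataset(ConcatDataset):
    """Concatenate datasets built from different ann_files."""

    def __init__(self, datasets):
        super(ConcatAnnDataset, self).__init__(datasets)
        # GroupSampler expects a per-image `flag` array on the dataset.
        self.flag = np.concatenate([d.flag for d in datasets])

Would something along these lines be acceptable upstream?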
I wonder whether the time field of the training log is in minutes or in seconds? As below:
2018-10-23 10:51:07,015 - INFO - Epoch [3][50/2500] lr: 0.02000, time: 1.090, data_time: 0.015, loss_reg: 0.0096, acc: 99.0479, loss_cls: 0.0230, loss_rpn_cls: 0.0012, loss_rpn_reg: 0.0055, loss: 0.0394
2018-10-23 10:51:58,164 - INFO - Epoch [3][100/2500] lr: 0.02000, time: 1.023, data_time: 0.006, loss_reg: 0.0099, acc: 99.0254, loss_cls: 0.0234, loss_rpn_cls: 0.0011, loss_rpn_reg: 0.0060, loss: 0.0403
2018-10-23 10:52:50,521 - INFO - Epoch [3][150/2500] lr: 0.02000, time: 1.047, data_time: 0.007, loss_reg: 0.0097, acc: 98.9961, loss_cls: 0.0237, loss_rpn_cls: 0.0011, loss_rpn_reg: 0.0064, loss: 0.0408
Hi,
Thanks for sharing this great work.
I tried your code to train a Faster R-CNN FPN detector with ResNet-50 using the config file configs/faster_rcnn_r50_fpn_1x.py
by running
python tools/train.py ./configs/faster_rcnn_r50_fpn_1x.py --gpus 4 --work_dir ./output --validate
And I tested the model with
python tools/test.py ./configs/faster_rcnn_r50_fpn_1x.py ./output/latest.pth --gpus 4 --out ./output/results.pkl --eval bbox.
I got this result:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.356
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.571
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.382
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.204
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.393
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.450
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.304
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.489
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.515
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.331
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.556
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.644
I noticed the result (35.6) is lower than the one reported in the MODEL_ZOO (36.4).
I also tested the 11th-epoch checkpoint, and the result is 35.0.
Do you think this is normal? I don't think the variance should be this large.
Thanks very much to the team for sharing the code!
When I read the code, I cannot find the FishNet backbone or the Guided Anchoring design. Will these parts be released in the future, or are they somewhere that I haven't noticed?
Hello! Thanks for your nice work.
If I want to train on my own dataset, what should I do?
Our dataset is a medical image dataset in PASCAL VOC format.
Training
from mmcv import Config
cfg = Config.fromfile('xx_config.py')
# users can overwrite some config values here
model = build_detector(cfg.model, train_cfg=cfg.train_cfg)
train_detector(model)
Inference
from mmcv import Config
from mmcv.runner import load_checkpoint
cfg = Config.fromfile('xx_config.py')
model = build_detector(cfg.model, train_cfg=cfg.train_cfg)
# another method to construct a model is "model = FasterRCNN(**kwargs)"
load_checkpoint(model, 'checkpoint.pth')
bboxes = inference_detector(model, 'a.jpg', device='cuda:0')
I find that the _non_dist_train()
function in mmdet/apis/train.py
does not use the 'validate' parameter, so I cannot see the validation results even though I pass '--validate'.
I notice that in the class "TwoStageDetector", the init_weights function does not call "self.mask_head.init_weights".
Is this by design?
import torch.nn.functional as F

def sigmoid_focal_loss(pred, target, weight, gamma=2.0, alpha=0.25,
                       reduction='elementwise_mean'):
    pred_sigmoid = pred.sigmoid()
    pt = (1 - pred_sigmoid) * target + pred_sigmoid * (1 - target)
    weight = (alpha * target + (1 - alpha) * (1 - target)) * weight
    weight = weight * pt.pow(gamma)
    return F.binary_cross_entropy_with_logits(
        pred, target, weight, reduction=reduction)
There is an input named weight in the focal loss. Could you explain what this weight is and how I can obtain it? Thank you very much.
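To make sure I understand it, here is a minimal usage sketch under my own assumptions (the training pipeline presumably builds this weight from the anchor/proposal targets, e.g. zeros for ignored samples); with an all-ones weight the call reduces to the plain sigmoid focal loss:

import torch

pred = torch.randn(8, 4)          # raw logits for 8 samples, 4 classes
target = torch.zeros(8, 4)
target[0, 2] = 1.0                # one positive label
weight = torch.ones_like(target)  # per-element weight; set to 0 to ignore entries
loss = sigmoid_focal_loss(pred, target, weight)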
I try to train a model resuming from the checkpoint epoch_5.pth, but I got the following message:
Traceback (most recent call last):
  File "tools/train.py", line 84, in <module>
    main()
  File "tools/train.py", line 80, in main
    logger=logger)
  File "/home/rusu5516/anaconda3/envs/pytorch4/lib/python3.5/site-packages/mmdet-0.5.0+c21ff08-py3.5.egg/mmdet/apis/train.py", line 59, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
  File "/home/rusu5516/anaconda3/envs/pytorch4/lib/python3.5/site-packages/mmdet-0.5.0+c21ff08-py3.5.egg/mmdet/apis/train.py", line 117, in _non_dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/rusu5516/anaconda3/envs/pytorch4/lib/python3.5/site-packages/mmcv-0.2.0-py3.5.egg/mmcv/runner/runner.py", line 349, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/rusu5516/anaconda3/envs/pytorch4/lib/python3.5/site-packages/mmcv-0.2.0-py3.5.egg/mmcv/runner/runner.py", line 262, in train
    self.call_hook('after_train_iter')
  File "/home/rusu5516/anaconda3/envs/pytorch4/lib/python3.5/site-packages/mmcv-0.2.0-py3.5.egg/mmcv/runner/runner.py", line 222, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/rusu5516/anaconda3/envs/pytorch4/lib/python3.5/site-packages/mmcv-0.2.0-py3.5.egg/mmcv/runner/hooks/optimizer.py", line 20, in after_train_iter
    runner.optimizer.step()
  File "/home/rusu5516/anaconda3/envs/pytorch4/lib/python3.5/site-packages/torch/optim/sgd.py", line 101, in step
    buf.mul_(momentum).add_(1 - dampening, d_p)
RuntimeError: The expanded size of the tensor (480) must match the existing size (64) at non-singleton dimension 1
When I run ./compile.sh, I get the following errors:
/home/rusu5516/miniconda3/envs/pytorch4/lib/python3.5/site-packages/torch/lib/include/ATen/TensorMethods.h:646:36: required from here
/usr/include/c++/6/tuple:483:67: error: mismatched argument pack lengths while expanding ‘std::is_constructible<_Elements, _UElements&&>’
return _and<is_constructible<_Elements, _UElements&&>...>::value;
^~~~~
/usr/include/c++/6/tuple:484:1: error: body of constexpr function ‘static constexpr bool std::_TC<, _Elements>::_MoveConstructibleTuple() [with _UElements = {std::tuple<at::Tensor, at::Tensor, at::Tensor>}; bool = true; _Elements = {at::Tensor, at::Tensor, at::Tensor}]’ not a return-statement
}
/home/rusu5516/miniconda3/envs/pytorch4/lib/python3.5/site-packages/torch/lib/include/ATen/TensorMethods.h:646:36: required from here
/usr/include/c++/6/tuple:489:65: error: mismatched argument pack lengths while expanding ‘std::is_convertible<_UElements&&, _Elements>’
return _and<is_convertible<_UElements&&, _Elements>...>::value;
^~~~~
/usr/include/c++/6/tuple:490:1: error: body of constexpr function ‘static constexpr bool std::_TC<, _Elements>::_ImplicitlyMoveConvertibleTuple() [with _UElements = {std::tuple<at::Tensor, at::Tensor, at::Tensor>}; bool = true; _Elements = {at::Tensor, at::Tensor, at::Tensor}]’ not a return-statement
}
and plenty more errors like these.
When I train the model, I run:
python tools/train.py /home1/clx/mmdetection/configs/mask_rcnn_r50_fpn_1x.py --gpus 2 --work_dir /home1/clx/mmdetection/logs/logs_mask/ --validat
But after the training process, in the work_dir there is a file named "20181019_155702.log".
Code refactoring.
Hi, hellock,
Thanks for bringing such a great project to us! But where can I find the RetinaNet code? Have you released it? I personally need a PyTorch version of RetinaNet as a baseline in my experiments, and there is a deadline. Can you help me? Thanks a lot!
Hi,
Is it possible to run mmdetection on the CPU (without a GPU)?
Hi.
Thank you for the great work. I found some strange behaviours when trying to train Faster R-CNN with imgs_per_gpu=3 (although the number 3 is somewhat unusual). One is that at the very beginning, training accuracy is above ~95%, which is relatively high compared to ~89% with imgs_per_gpu=2 or 4. The other is that after the validation epoch, an error related to indices occurs.
Loading and preparing results...
Traceback (most recent call last):
  File "./tools/train.py", line 81, in <module>
    main()
  File "./tools/train.py", line 77, in main
    logger=logger)
  File "/home/root/miniconda3/envs/mytorch/lib/python3.6/site-packages/mmdet/apis/train.py", line 57, in train_detector
    _dist_train(model, dataset, cfg, validate=validate)
  File "/home/root/miniconda3/envs/mytorch/lib/python3.6/site-packages/mmdet/apis/train.py", line 92, in _dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/root/miniconda3/envs/mytorch/lib/python3.6/site-packages/mmcv/runner/runner.py", line 349, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/root/miniconda3/envs/mytorch/lib/python3.6/site-packages/mmcv/runner/runner.py", line 265, in train
    self.call_hook('after_train_epoch')
  File "/home/root/miniconda3/envs/mytorch/lib/python3.6/site-packages/mmcv/runner/runner.py", line 222, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/root/miniconda3/envs/mytorch/lib/python3.6/site-packages/mmdet/core/evaluation/eval_hooks.py", line 93, in after_train_epoch
    self.evaluate(runner, results)
  File "/home/root/miniconda3/envs/mytorch/lib/python3.6/site-packages/mmdet/core/evaluation/eval_hooks.py", line 134, in evaluate
    cocoDt = cocoGt.loadRes(tmp_file)
  File "/home/root/miniconda3/envs/mytorch/lib/python3.6/site-packages/pycocotools-2.0-py3.6-linux-x86_64.egg/pycocotools/coco.py", line 318, in loadRes
    if 'caption' in anns[0]:
IndexError: list index out of range
Hi
I have problems when running 'compile.sh'. I get the following errors:
/home/yao/.local/lib/python2.7/site-packages/torch/lib/include/THC/THCAtomics.cuh(100): error: cannot overload functions distinguished by return type alone
/home/yao/.local/lib/python2.7/site-packages/torch/lib/include/THC/THCAtomics.cuh(123): error: return value type does not match the function type
2 errors detected in the compilation of "/tmp/tmpxft_0000150c_00000000-4_roi_align_kernel.cpp4.ii".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 2
It seems that the problem is with the CUDA version. Which CUDA version was this package tested with? I am using CUDA 10.0 and PyTorch 0.4.1.