
fewshotdetection's Introduction

Few-Shot Object Detection

(ECCV 2020) PyTorch implementation of paper "Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild"
[PDF] [Project webpage] [Code (Viewpoint)]

(teaser figure)

If our project is helpful for your research, please consider citing:

@INPROCEEDINGS{Xiao2020FSDetView,
    author    = {Yang Xiao and Renaud Marlet},
    title     = {Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year      = {2020}}

ChangeLog

  • [Dec-15 2020] The COCO base model weights downloaded by download_models.sh may be incorrect; please download the correct weights from here and put them in save_models/COCO/.

Table of Contents

Installation

Code built on top of MetaR-CNN.

Requirements

  • CUDA 8.0
  • Python=3.6
  • PyTorch=0.4.0
  • torchvision=0.2.1
  • gcc >= 4.9

Build

  • Create conda env:
conda create --name FSdetection --file spec-file.txt
conda activate FSdetection
  • Compile the CUDA dependencies:
cd {repo_root}/lib
sh make.sh

Data Preparation

We evaluate our method on two commonly-used benchmarks. See data/README.md for more details.

PASCAL VOC

We use the train/val sets of PASCAL VOC 2007+2012 for training and the test set of PASCAL VOC 2007 for evaluation. We split the 20 object classes into 15 base classes and 5 novel classes, and we consider the 3 splits proposed in FSRW.

Download PASCAL VOC 2007+2012 and create a softlink named VOCdevkit in the folder data/.

COCO

We use COCO 2014, keeping the 5k images of the minival set for evaluation and using the rest for training. The 20 object classes shared with PASCAL VOC are used as novel classes and the remaining classes as base classes.

Download COCO 2014 and create a softlink named coco in the folder data/. Please follow the instructions here to download instances_minival2014.json and instances_valminusminival2014.json.

Getting Started

1. Base-Class Training

Pre-trained ResNet: following Meta R-CNN, we use ResNet101 for PASCAL VOC and ResNet50 for MS-COCO. Download them and put them into data/pretrained_model/.

  • We provide pre-trained models of base-class training:
bash download_models.sh

You will get a dir like:

save_models/
    COCO/
    VOC_first/
    VOC_second/
    VOC_third/
  • You can also train it yourself:
# the first split on VOC
bash run/train_voc_first.sh

# the second split on VOC
bash run/train_voc_second.sh

# the third split on VOC
bash run/train_voc_third.sh

# NonVOC / VOC split on COCO
bash run/train_coco.sh

2. Few-Shot Fine-tuning

Fine-tune the base-training models on balanced training data including both base and novel classes (3K instances per base class and K instances per novel class; a rough sampling sketch follows the commands below):

bash run/finetune_voc_first.sh

bash run/finetune_voc_second.sh

bash run/finetune_voc_third.sh

bash run/finetune_coco.sh
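For intuition, the balanced few-shot set could be assembled roughly as in the sketch below. This is a hypothetical illustration, not the actual sampling code of this repository; all names (sample_balanced_shots, annotations, etc.) are made up.

```python
import random
from collections import defaultdict

def sample_balanced_shots(annotations, base_classes, novel_classes, K, seed=0):
    """Hypothetical helper: pick 3K instances per base class and K per novel class."""
    random.seed(seed)
    per_class = defaultdict(list)
    for ann in annotations:                      # ann: (image_id, class_name, bbox)
        per_class[ann[1]].append(ann)
    selected = []
    for cls in base_classes:
        selected += random.sample(per_class[cls], min(3 * K, len(per_class[cls])))
    for cls in novel_classes:
        selected += random.sample(per_class[cls], min(K, len(per_class[cls])))
    return selected
```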

3. Testing

Evaluation is conducted on the test set of PASCAL VOC 2007 or minival set of COCO 2014:

bash run/test_voc_first.sh

bash run/test_voc_second.sh

bash run/test_voc_third.sh

bash run/test_coco.sh

Quantitative Results

Multiple Runs

By running the few-shot fine-tuning experiments multiple times (~10) and averaging the results, we obtained the performance below:

Pascal-VOC (AP@50)

|      | Split-1 (Base) | Split-1 (Novel) | Split-2 (Base) | Split-2 (Novel) | Split-3 (Base) | Split-3 (Novel) |
|------|----------------|-----------------|----------------|-----------------|----------------|-----------------|
| K=1  | 64.2 | 24.2 | 66.9 | 21.6 | 66.7 | 21.1 |
| K=2  | 67.8 | 35.3 | 69.9 | 24.6 | 69.1 | 30.0 |
| K=3  | 69.4 | 42.2 | 70.8 | 31.9 | 69.9 | 37.2 |
| K=5  | 69.8 | 49.1 | 71.4 | 37.0 | 70.9 | 43.8 |
| K=10 | 71.1 | 57.4 | 72.2 | 45.7 | 72.2 | 49.6 |

MS-COCO

|      | AP (Base) | AP@50 (Base) | AP@75 (Base) | AP (Novel) | AP@50 (Novel) | AP@75 (Novel) |
|------|-----------|--------------|--------------|------------|---------------|---------------|
| K=1  | 3.6  | 9.8  | 1.7 | 4.5  | 12.4 | 2.2  |
| K=2  | 5.0  | 13.0 | 2.7 | 6.6  | 17.1 | 3.5  |
| K=3  | 5.9  | 14.7 | 3.9 | 7.2  | 18.7 | 3.7  |
| K=5  | 8.6  | 20.3 | 6.0 | 10.7 | 24.5 | 6.7  |
| K=10 | 10.5 | 23.3 | 8.2 | 12.5 | 27.3 | 9.8  |
| K=30 | 12.7 | 26.1 | 9.7 | 14.7 | 30.6 | 12.2 |

Specific Split

For a direct and quick comparison on COCO, we also run experiments using the specific few-shot sample split provided in TFA.

  • Download their json files into the annotation folder of COCO:
cd ./data/coco/annotations
mkdir TFA && cd TFA
wget -r --no-parent  http://dl.yf.io/fs-det/datasets/cocosplit/
mv dl.yf.io/fs-det/datasets/cocosplit/ cocosplit && rm -r dl.yf.io 
  • You will see a set of json files in the format "full_box_{K}shot_{cls}_trainval.json", as well as 9 folders named "seed{i}".

  • Then run the commands in run/finetune_coco_TFA.sh to perform few-shot fine-tuning and testing.

MS-COCO

We get the following results:

|      | AP (Base) | AP@50 (Base) | AP@75 (Base) | AP (Novel) | AP@50 (Novel) | AP@75 (Novel) |
|------|-----------|--------------|--------------|------------|---------------|---------------|
| K=1  | 2.4  | 7.0  | 1.0  | 3.2  | 8.9  | 1.4  |
| K=2  | 4.4  | 11.9 | 2.2  | 4.9  | 13.3 | 2.3  |
| K=3  | 4.9  | 13.6 | 2.2  | 6.7  | 18.6 | 2.9  |
| K=5  | 7.0  | 17.5 | 4.4  | 8.1  | 20.1 | 4.4  |
| K=10 | 9.0  | 21.2 | 6.1  | 10.7 | 25.6 | 6.5  |
| K=30 | 12.3 | 26.4 | 10.2 | 15.9 | 31.7 | 15.1 |

Note: the difference between the multiple-run performance and the performance on this specific split can be explained by the different sample configuration in the few-shot fine-tuning stage. Following the strategy proposed in Meta R-CNN, we sample 3K instances per base class and K instances per novel class; without any specific adjustment, the performance naturally degrades when we simply use the TFA split, where only K instances are considered for each base class.

fewshotdetection's People

Contributors

youngxiao13, ze-yang


fewshotdetection's Issues

conda create failed

Hello.
An error pops up during the installation of the environment:

`CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/cudnn-6.0.21-cuda8.0_0.tar.bz2
Elapsed: 00:00.271930
CF-RAY: 5c543dfaba149ab6-FRA

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/backports.weakref-1.0rc1-py36_0.tar.bz2
Elapsed: 00:00.198275
CF-RAY: 5c543fb34d171ea5-AMS

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/html5lib-0.9999999-py36_0.tar.bz2
Elapsed: 00:00.270236
CF-RAY: 5c543fc4fe45c83f-AMS

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/bleach-1.5.0-py36_0.tar.bz2
Elapsed: 00:00.325310
CF-RAY: 5c543fd2bad4178a-FRA

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/tensorflow-gpu-base-1.3.0-py36cuda8.0cudnn6.0_1.tar.bz2
Elapsed: 00:00.206814
CF-RAY: 5c5440460ab8bdaf-AMS

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/tensorflow-gpu-1.3.0-0.tar.bz2
Elapsed: 00:00.404021
CF-RAY: 5c5440487e160625-FRA

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.`

Where can I find the required versions of the libraries?

Thank you

Evaluating COCO base

Hi,

I am trying to evaluate the base checkpoint for COCO to see AP results for each class before few-shot training.

  • With --meta_test it needs the saved mean_class_attentions.pkl from training, which I can extract, but there seems to be an error with the provided checkpoint. Before saving it, I printed out the keys in the dict for the classes and there were only 3 of them:
    (I've extracted these from scratch, deleting all .pt or .pkl files that were generated)
after class filtering, there are 197098 images...

197098 roidb entries
Loading pretrained weights from data/pretrained_model/resnet101_caffe.pth
loading checkpoint save_models/COCO/coco_metarcnn_200_20.pth
loaded checkpoint save_models/COCO/coco_metarcnn_200_20.pth
dict_keys([11, 23, 17])

With the new checkpoint proposed for COCO [Dec-15 2020]:

after class filtering, there are 197098 images...

197098 roidb entries
Loading pretrained weights from data/pretrained_model/resnet101_caffe.pth
loading checkpoint save_models/COCO/coco_metarcnn_200_20.pth
loaded checkpoint save_models/COCO/coco_metarcnn_200_20.pth
dict_keys([13])
  • Without the --meta_test flag, meaning that no mean_class_attentions.pkl is required, it also fails:
RuntimeError: size mismatch, m1: [300 x 2048], m2: [4096 x 324] at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/generic/THCTensorMathBlas.cu:249

Data seems fine and in place as expected.
@YoungXIAO13 Could you help with this issue? Can you verify that the COCO checkpoint is correct? Do you have checkpoints that have already been trained with the few-shot part?

@hjraad

custom dataset

@YoungXIAO13 Any clue how to apply this to a custom dataset? How about adding this to your readme? It could be very useful!

training does not proceed post dataset initialization

Thanks for your amazing paper. But it looks like the training process gets stuck after the dataset initialization stage, specifically after this step:
Loaded dataset voc_2007_train_first_split for training
Set proposal method: gt
Appending horizontally-flipped training examples...
wrote gt roidb to /home/anay/amajee/fsod_additional_features/data/cache/voc_2007_train_first_split_gt_roidb.pkl
done
Preparing training data...
done
Loaded dataset voc_2012_train_first_split for training
Set proposal method: gt
Appending horizontally-flipped training examples...
wrote gt roidb to /home/anay/amajee/fsod_additional_features/data/cache/voc_2012_train_first_split_gt_roidb.pkl
done
Preparing training data...
done
before filtering, there are 12330 images...
after filtering, there are 12330 images...

before class filtering, there are 12330 images...
after class filtering, there are 12330 images...

12330 roidb entries
Loading pretrained weights from data/pretrained_model/resnet101_caffe.pth
!!!! After this point there is no output for quite a long time

I am using the same environment as the repository with pytorch=0.4.0 and CUDA 8.0. Please advise on the same.

ValueError: too many values to unpack (expected 3)

Hi, thanks for your work.
I ran base training myself and then fine-tuned.
I have run bash run/test_voc_first.sh and got this error randomly.
ValueError: too many values to unpack (expected 3)
The error happens in imdb.evaluate_detections(all_boxes, output_dir, **vars(args)).
How can I solve it?
Thanks!

How do you ensure fair data usage?

You created 2 datasets. One is the meta dataset (support dataset). The other is the training dataset. Both datasets have been filtered to contain only K training samples per novel category (K as in K-shot).

However, I don't find the code that ensures the samples from the meta dataset and the training dataset are exactly the same. If the two datasets are mutually exclusive, then the total number of training samples used would be 2x the intended number.

Please kindly correct me if I am missing something here. Thank you very much! @YoungXIAO13

No module named 'model.utils.cython_bbox'

Hi,

I followed the instructions (I think) and as I try running: bash run/finetune_voc_first.sh, I get the following error trace:

Traceback (most recent call last):
File "train.py", line 20, in
from roi_data_layer.roidb import combined_roidb, rank_roidb_ratio, filter_class_roidb_flip, clean_roidb
File "/home/lab/FSDetView/lib/roi_data_layer/roidb.py", line 9, in
from datasets.factory import get_imdb
File "/home/lab/FSDetView/lib/datasets/factory.py", line 15, in
from datasets.coco import coco
File "/home/lab/FSDetView/lib/datasets/coco.py", line 10, in
from datasets.imdb import imdb
File "/home/lab/FSDetView/lib/datasets/imdb.py", line 14, in
from model.utils.cython_bbox import bbox_overlaps
ModuleNotFoundError: No module named 'model.utils.cython_bbox'

It might be worth noting as well that the previous error was complaining that data/cache doesn't exist, which I circumvented by commenting out the os.list("data/cache") line in _init_paths.py.

What is the idea behind breaking if classes[cls] >= self.shots in metadata creation?

In earlier versions of metadata_coco.py, the loop over annotations of an image would always break if we find an annotation for a category we already sampled enough annotations for:

if classes[cls] >= self.shots:
    break

Now, we only break in this case if we are in phase 1 and continue sampling of annotations for phase 2:
if classes[cls] >= self.shots:
    if self.phase == 2:
        continue
    else:
        break

What is generally the idea of breaking instead of continuing in this case?
Why have phase-dependent actions in the current version, and why not always continue the loop, thus checking whether the image contains annotations of categories we haven't yet sampled enough annotations for?
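To make the difference concrete, here is a simplified sketch of the annotation loop as I read it (image_annotations and keep() are placeholders, not the actual code):

```python
for ann in image_annotations:          # placeholder: annotations of the current image
    cls = ann['category_id']
    if classes[cls] >= self.shots:
        if self.phase == 2:
            continue                   # skip this annotation, keep scanning the image
        else:
            break                      # phase 1: stop processing this image entirely,
                                       # even if a later annotation belongs to a class
                                       # that still needs more samples
    classes[cls] += 1
    keep(ann)                          # placeholder for the actual bookkeeping
```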

Some packages in spec-file.txt don't exist

CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/backports.weakref-1.0rc1-py36_0.tar.bz2
Elapsed: 00:00.636837
CF-RAY: 5ba4566f1fe00ba4-HKG

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/html5lib-0.9999999-py36_0.tar.bz2
Elapsed: 00:00.521513
CF-RAY: 5ba45672bd1d1995-HKG

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/bleach-1.5.0-py36_0.tar.bz2
Elapsed: 00:00.504550
CF-RAY: 5ba4567608961923-HKG

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/tensorflow-gpu-base-1.3.0-py36cuda8.0cudnn6.0_1.tar.bz2
Elapsed: 00:00.522496
CF-RAY: 5ba4567938783291-HKG

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

CondaHTTPError: HTTP 404 NOT FOUND for url https://conda.anaconda.org/anaconda/linux-64/tensorflow-gpu-1.3.0-0.tar.bz2
Elapsed: 00:00.570590
CF-RAY: 5ba4567c7cda0bcc-HKG

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

Some questions about "Class Data" ?

Thanks for your work; however, I have a question: in each iteration, do you sample 15 different objects for the 15 base categories as the support set data? In other words, should we sample 60 different objects for the 60 base categories as the support set data? I do not have a GPU with enough memory. Can you give me some suggestions?

TypeError: 'NoneType' object is not iterable

@YoungXIAO13, hello, when I use 'bash run/train_voc_first.sh' to run 'train.py', I have the following problem:
After training for a certain number of iterations, this error suddenly appears. I don't know how to solve it. Can you help me take a look?

.........
[session 1][epoch  4][iter 1100] loss: 0.4251, lr: 1.00e-03
                        fg/bg=(117/395), time cost: 82.185346
                        rpn_cls: 0.0298, rpn_box: 0.0210, rcnn_cls: 0.1314, rcnn_box 0.1463, meta_loss 0.0307
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524580978845/work/aten/src/THC/generic/THCTensorCopy.c line=21 error=4 : unspecified launch failure
Traceback (most recent call last):
  File "train.py", line 446, in <module>
    num_boxes_list)
  File "/home/zhangwei/anaconda3/envs/sfsmtl35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/hdd/shenfanshu/DAFSdetection/FewShotDetection-master/lib/model/faster_rcnn/faster_rcnn.py", line 77, in forward
    rois, rpn_loss_cls, rpn_loss_bbox = self.RCNN_rpn(base_feat, im_info, gt_boxes, num_boxes)
  File "/home/zhangwei/anaconda3/envs/sfsmtl35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/hdd/shenfanshu/DAFSdetection/FewShotDetection-master/lib/model/rpn/rpn.py", line 78, in forward
    im_info, cfg_key))
  File "/home/zhangwei/anaconda3/envs/sfsmtl35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/hdd/shenfanshu/DAFSdetection/FewShotDetection-master/lib/model/rpn/proposal_layer.py", line 85, in forward
    shifts = shifts.contiguous().type_as(scores).float()
RuntimeError: cuda runtime error (4) : unspecified launch failure at /opt/conda/conda-bld/pytorch_1524580978845/work/aten/src/THC/generic/THCTensorCopy.c:21
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7ffb6c0d8208>>
Traceback (most recent call last):
  File "/home/zhangwei/anaconda3/envs/sfsmtl35/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 349, in __del__
  File "/home/zhangwei/anaconda3/envs/sfsmtl35/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 328, in _shutdown_workers
  File "/home/zhangwei/anaconda3/envs/sfsmtl35/lib/python3.5/multiprocessing/queues.py", line 337, in get
  File "<frozen importlib._bootstrap>", line 968, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 887, in _find_spec
TypeError: 'NoneType' object is not iterable

error on bash run/test_voc_first.sh

I got this error when using bash run/test_voc_first.sh:

(FSdetection) jsi@jsi-GS75-Stealth-9SE:~/PycharmProjects/FewShotDetection$ bash run/test_voc_first.sh 
Traceback (most recent call last):
  File "test.py", line 22, in <module>
    from model.utils.net_utils import save_net, load_net, vis_detections, vis_detections_label_only
  File "/home/jsi/PycharmProjects/FewShotDetection/lib/model/utils/net_utils.py", line 8, in <module>
    from model.roi_crop.functions.roi_crop import RoICropFunction
  File "/home/jsi/PycharmProjects/FewShotDetection/lib/model/roi_crop/functions/roi_crop.py", line 4, in <module>
    from .._ext import roi_crop
  File "/home/jsi/PycharmProjects/FewShotDetection/lib/model/roi_crop/_ext/roi_crop/__init__.py", line 3, in <module>
    from ._roi_crop import lib as _lib, ffi as _ffi
ImportError: /home/jsi/PycharmProjects/FewShotDetection/lib/model/roi_crop/_ext/roi_crop/_roi_crop.so: undefined symbol: __cudaRegisterFatBinaryEnd
(the same traceback is repeated several more times)

Here is my environment, which is PyTorch 0.4.0 and CUDA 8.0. I don't know what's going wrong:

(FSdetection) jsi@jsi-GS75-Stealth-9SE:~/PycharmProjects/FewShotDetection$ conda list
# packages in environment at /home/jsi/anaconda3/envs/FSdetection:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main    defaults
blas                      1.0                         mkl    defaults
ca-certificates           2020.12.8            h06a4308_0    defaults
certifi                   2020.12.5        py36h06a4308_0    defaults
cffi                      1.14.4           py36h261ae71_0    defaults
cuda80                    1.0                  h205658b_0    pytorch
cudatoolkit               8.0                           3    defaults
cython                    0.29.21          py36h2531618_0    defaults
easydict                  1.9                      pypi_0    pypi
freetype                  2.10.4               h5ab3b9f_0    defaults
intel-openmp              2020.2                      254    defaults
jpeg                      9b                   h024ee3a_2    defaults
lcms2                     2.11                 h396b838_0    defaults
libffi                    3.3                  he6710b0_2    defaults
libgcc-ng                 9.1.0                hdf63c60_0    defaults
libpng                    1.6.37               hbc83047_0    defaults
libstdcxx-ng              9.1.0                hdf63c60_0    defaults
libtiff                   4.1.0                h2733197_1    defaults
lz4-c                     1.9.2                heb0550a_3    defaults
mkl                       2020.2                      256    defaults
mkl-service               2.3.0            py36he8ac12f_0    defaults
mkl_fft                   1.2.0            py36h23d657b_0    defaults
mkl_random                1.1.1            py36h0573a6f_0    defaults
ninja                     1.10.2           py36hff7bd54_0    defaults
numpy                     1.19.2           py36h54aff64_0    defaults
numpy-base                1.19.2           py36hfa32c7d_0    defaults
olefile                   0.46                     py36_0    defaults
openssl                   1.0.2u               h7b6447c_0    defaults
pandas                    0.25.3                   pypi_0    pypi
pillow                    8.1.0            py36he98fc37_0    defaults
pip                       20.3.3           py36h06a4308_0    defaults
pycparser                 2.20                       py_2    defaults
python                    3.6.0                         0    defaults
pytorch                   0.4.0           py36_cuda8.0.61_cudnn7.1.2_1    pytorch
pytz                      2020.5                   pypi_0    pypi
readline                  6.2                           2    defaults
scipy                     1.2.0                    pypi_0    pypi
setuptools                51.1.2           py36h06a4308_3    defaults
six                       1.15.0           py36h06a4308_0    defaults
sqlite                    3.13.0                        0    defaults
tk                        8.5.18                        0    defaults
torchvision               0.2.1                    py36_0    defaults
wheel                     0.36.2             pyhd3eb1b0_0    defaults
xz                        5.2.5                h7b6447c_0    defaults
zlib                      1.2.11               h7b6447c_3    defaults
zstd                      1.4.5                h9ceee32_0    defaults

Small data set

I'm going to train an object detection model, but my training set only has 50 images. Can you give me some advice?

so: undefined symbol: __cudaPopCallConfiguration

I ran into `so: undefined symbol: __cudaPopCallConfiguration`. I found out it indicates a mismatch between CUDA and PyTorch.
My environment is CUDA 8.0, Python 3.6, PyTorch 0.4.0, gcc 5.5; I also tried CUDA 9.0, Python 3.6, PyTorch 0.4.0, gcc 5.5.
Thanks to anyone who can help me.

In test.py vis result

im = cv2.imread(imdb.image_path_from_index(int(data[4])))

should be converted to:

im = cv2.imread(imdb.image_path_from_index(imdb.image_index[i]))

About metadata.py.

  1. In metadata.py, if phase==2, shots=shots*3. I don't understand why shots*3. Will this cause an unfair performance comparison?

  2. About metaclass: when fine-tuning, why specify all_classes_first/second/third? Are these three the same? Does the order matter?

Some problems about compiling environment

Hello! I have some problems when running the program. I suspect they are caused by my configuration environment, so I want to ask about your configuration, including the Ubuntu version, the CUDA version, and the graphics card model.

How is AP(novel) > AP(base) on COCO?

Correct me if I am mistaken, but the reported AP values for base and novel might be swapped.
With the model seeing only 1/3 as many novel samples as base samples, we would expect the novel detection accuracy to also be lower accordingly. The reported accuracies for VOC seem to abide by this expectation. What is different on COCO for the results to be skewed the other way?

Bug with mask in metadata creation?

Probably, there is a bug with the mask variable in metadata creation. The variable is created in the loop over all images but is modified in the loop over the annotations of an image. When sampling more than one annotation per image, the mask is not reset to 0s but still holds the 1s from the bbox of the previous annotation. The next annotation then sets 1s for its own bbox, which can result in a wrong mask for that annotation, since the old 1s are still present.

Originally, there was a limit to sample at most one annotation per image


which would prevent that inconsistency of the mask variable, since it would be reset to 0s before processing the annotations of a new image.

But since commit 838f471, we allow sampling multiple annotations per image, which can lead to the inconsistency described above. Currently, we have a phase-dependent strategy,

if self.phase == 1:
    break

which could still lead to problems for phase 2.

I'm not sure whether we should reset the mask's values back to zero (mask[y1:y2, x1:x2] = 0) or completely redefine the variable (mask = np.zeros((self.img_size, self.img_size), dtype=np.float32)) directly after

prn_mask[cls].append(mask)
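For clarity, a minimal sketch of the second option (re-creating the mask for every annotation); annotations_of_image is a placeholder for the actual loop variables in metadata_coco.py:

```python
import numpy as np

# Allocate a fresh mask per annotation so 1s from a previous bbox can never leak over.
for (x1, y1, x2, y2), cls in annotations_of_image:   # placeholder loop
    mask = np.zeros((self.img_size, self.img_size), dtype=np.float32)
    mask[y1:y2, x1:x2] = 1
    prn_mask[cls].append(mask)
```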

ValueError: bg_num_rois = 0 and fg_num_rois = 0, this should not happen!

I tried to run the code on Pascal VOC 2007 only, with resnet-50, and it worked.
Then I tried to run the code on Pascal VOC 2007+2012, with resnet-101, and this bug appeared.
The environment I used is
CUDA 9.0
Python=3.7
PyTorch=0.4.1.post2
torchvision=0.2.1.post2
in colab

I tried several methods, such as

delete the -1 in pascal_voc.py

    # Load object bounding boxes into a data frame.
    for ix, obj in enumerate(objs):
        bbox = obj.find('bndbox')
        # Make pixel indexes 0-based
        x1 = float(bbox.find('xmin').text) #- 1
        y1 = float(bbox.find('ymin').text) #- 1
        x2 = float(bbox.find('xmax').text) #- 1
        y2 = float(bbox.find('ymax').text) #- 1

and

delete -1 in imdb.py

for i in range(num_images):
  boxes = self.roidb[i]['boxes'].copy()
  oldx1 = boxes[:, 0].copy()
  oldx2 = boxes[:, 2].copy()
  boxes[:, 0] = widths[i] - oldx2# - 1
  boxes[:, 2] = widths[i] - oldx1# - 1

and

I changed the TRAIN.RPN_MIN_SIZE = 8 to 0

I've tried all the methods mentioned in jwyang/faster-rcnn.pytorch#111 but they didn't work.
Could you tell me how to fix the bug?

Called with args:
Namespace(TFA=False, batch_size=4, checkepoch=10, checkpoint=21985, checkpoint_interval=10000, checksession=1, class_agnostic=False, cuda=True, dataset='pascal_voc_0712', disp_interval=100, fix_encoder=False, log_dir='checkpoint', lr=0.001, lr_decay_gamma=0.1, lr_decay_step=4, max_epochs=21, meta_loss=True, meta_train=True, meta_type=1, net='metarcnn', num_workers=1, optimizer='sgd', phase=1, resume=False, save_dir='save_models/VOC_first', session=1, shots=1, start_epoch=1, use_tfboard=True)
Loaded dataset voc_2007_train_first_split for training
Set proposal method: gt
Appending horizontally-flipped training examples...
wrote gt roidb to /content/drive/MyDrive/few shot/data/cache/voc_2007_train_first_split_gt_roidb.pkl
done
Preparing training data...
done
Loaded dataset voc_2012_train_first_split for training
Set proposal method: gt
Appending horizontally-flipped training examples...
wrote gt roidb to /content/drive/MyDrive/few shot/data/cache/voc_2012_train_first_split_gt_roidb.pkl
done
Preparing training data...
done
before filtering, there are 12330 images...
after filtering, there are 12330 images...

before class filtering, there are 12330 images...
after class filtering, there are 12330 images...

12330 roidb entries
Loading pretrained weights from data/resnet101.pth
[session 1][epoch 1][iter 0] loss: 17.7643, lr: 1.00e-03
fg/bg=(38/474), time cost: 1.222608
rpn_cls: 0.8498, rpn_box: 0.4404, rcnn_cls: 15.8508, rcnn_box 0.4173, meta_loss 0.2060
Traceback (most recent call last):
File "train.py", line 477, in
rois_label, cls_prob, bbox_pred, meta_loss = fasterRCNN(im_data_list, im_info_list, gt_boxes_list, num_boxes_list)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/content/drive/MyDrive/few shot/lib/model/faster_rcnn/faster_rcnn.py", line 84, in forward
roi_data = self.RCNN_proposal_target(rois, gt_boxes, num_boxes)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/content/drive/MyDrive/few shot/lib/model/rpn/proposal_target_layer_cascade.py", line 47, in forward
rois_per_image, self._num_classes)
File "/content/drive/MyDrive/few shot/lib/model/rpn/proposal_target_layer_cascade.py", line 202, in _sample_rois_pytorch
raise ValueError("bg_num_rois = 0 and fg_num_rois = 0, this should not happen!")
ValueError: bg_num_rois = 0 and fg_num_rois = 0, this should not happen!

train custom datasets show error!

There is such an error when training your own data set, but there is no problem in checking my own data:
Traceback (most recent call last):
File "train.py", line 292, in
roidb = filter_class_roidb_flip(roidb, 0, imdb, base_num)
File "/home/map/lihui44/FewShotDetection/lib/roi_data_layer/roidb.py", line 162, in filter_class_roidb_flip
assert (boxes[:, 2] >= boxes[:, 0]).all()

If I comment out assert, another error will appear, Do you know the reason?

Base training accuracy(phase 1)

I was trying to generate the base-class accuracy numbers after phase 1 training on the VOC dataset.
Unfortunately, I get the error below. Is there a way to get the accuracy numbers from phase 1 training?
lib/model/faster_rcnn/faster_rcnn.py:210: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
cls_prob = F.softmax(cls_score)
Traceback (most recent call last):
File "test.py", line 344, in
inds = torch.nonzero(scores[:, j] > thresh).view(-1)
IndexError: index 16 is out of bounds for dimension 1 with size 16

Cannot replicate the result after pretraining on COCO

Thanks for your outstanding work and remarkable results!
However, after several attempts with different random seeds, we cannot replicate the results (10-shot mAP 12.5 and 30-shot mAP 14.7) after pretraining on COCO ourselves. To our surprise, the results seem just fine when we use your pretrained model on COCO.
Also, we find that you only use 20000+ base-only images in the pretraining phase, rather than images with both base and novel ground truths. Is this to avoid the RPN learning novel classes as background? Could you maybe explain this?

Can not reproduce results on VOC dataset after base training.

Hi, thanks for your work.
I tried to reproduce the results on the VOC2007 test set, but got confused.
First, I fine-tuned the provided base-trained model and got similar results as reported.
Second, I trained from the base stage to the fine-tuning stage myself and got much worse results, mAP nearly 0. I tried 3 parallel training runs and none of the attempts worked well (first split). All steps followed the README file.
Can you help explain this problem? Thanks.

How to construct 200 shots for coco training in phase I?

Hi,

I am wondering how to construct 200 shots per class as the meta dataset for COCO. As the code shows, you only select the meta dataset from train2014, rather than train2014+valminusminival2014. I find that in this way you can only get 44 hair drier and 37 toaster instances. Even when extending to train2014+valminusminival2014, there are only 198 hair drier instances available. Could you advise? I am not sure whether I have missed something. Any help is appreciated. Thanks.

bash run/train_voc_first.sh

(screenshot: 2021-04-22 15-21-27)
When I run bash run/train_voc_first.sh, it appears as shown in the screenshot. It has been in this state for a day. I don't know how to solve it. Can you help me? Thank you.

Report data leakage that causes unfair few-shot setting

As I mentioned in yanxp/MetaR-CNN#36, the released code of Meta R-CNN has a bug in shot sampling.

Take COCO 10-shot for example: you first construct a meta-data set comprising 30 (3x shots) (prn_image, prn_mask) pairs for each class. The image indexes of these sampled images are saved in a file named annotations/instances_shot2014.json.

After that, the roidb needs to be constructed to provide query samples for fine-tuning. According to the definition of the few-shot setting, you can only access the N-shot data (N instances per class) whether you perform fine-tuning or not. Therefore, the roidb should contain the same instances as the meta-data set; otherwise it exceeds the designated number of shots. However, as I find in your code, you do not save the anno_index of the instances selected for the meta-data set. Instead, you again randomly sample the shot instances from the image list given in annotations/instances_shot2014.json. In this case, I am concerned about how you guarantee that the newly sampled instances are exactly the same as those in the meta-data set.
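To make the concern concrete, one way to guarantee consistency would be to persist the sampled annotation ids when building the meta-data set and reuse them when building the roidb; a rough sketch (all names hypothetical):

```python
import json

# Meta-data set construction: record exactly which annotations were sampled.
sampled_ann_ids = [ann['id'] for ann in sampled_annotations]       # hypothetical list
with open('annotations/sampled_shot_ann_ids.json', 'w') as f:
    json.dump(sampled_ann_ids, f)

# roidb construction for fine-tuning: keep only those exact annotations instead of
# re-sampling K shots from the shortlisted images.
with open('annotations/sampled_shot_ann_ids.json') as f:
    keep_ids = set(json.load(f))
roidb_annotations = [ann for ann in all_annotations if ann['id'] in keep_ids]
```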

Hope that you can check about this issue and clarify my concern. Thanks a lot.

Training on Custom DataSet

Hi @YoungXIAO13

I want to train a model on my own custom dataset with 10 classes, 5 of which are novel.

I see most of train.py is set up for standard datasets like coco and pascal_voc_0712.

Can you please suggest how to move forward? How should I prepare the dataset, and how can I make your code work with custom training?

Please help with this

The aggregator question

Dear author, I have a question: why is subtraction used in aggregation, or in other words, why can subtraction represent similarity? Is this similar to Euclidean distance, where more similar objects end up closer to each other? Please give us your explanation. Thank you!
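For reference, here is how I understand the aggregation from the paper (a rough PyTorch sketch, where feat_q is the query RoI feature and feat_c is the class attention vector; please correct me if this is wrong):

```python
import torch

def aggregate(feat_q, feat_c):
    # Concatenate element-wise product, difference, and the raw query feature.
    # The subtraction term behaves like a signed per-channel distance, so a query
    # close to the class prototype yields values near zero, which the following
    # layers can interpret as high similarity.
    return torch.cat([feat_q * feat_c, feat_q - feat_c, feat_q], dim=-1)
```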

3 times few-shot training instances?

In meta dataset for VOC:
https://github.com/YoungXIAO13/FewShotDetection/blob/master/lib/datasets/metadata.py#L39
and for COCO:
https://github.com/YoungXIAO13/FewShotDetection/blob/master/lib/datasets/metadata_coco.py#L43

The variable shots is multiplied by 3:

if phase == 2:
    self.shots = shots * 3

where shots is the parameter passed by the command for training and is 10 or 30 for COCO:
https://github.com/YoungXIAO13/FewShotDetection/blob/master/run/finetune_coco.sh#L6

and self.shots is used as the number of instances for each class in the function get_prndata:
https://github.com/YoungXIAO13/FewShotDetection/blob/master/lib/datasets/metadata_coco.py#L234

Thus, in phase 2, you have 3 times as many instances compared with FSRW & TFA. Did I get it wrong?

Question about train and test

Thanks for your great work!

I have some questions about the training and test setting. During training, no matter which class attention vector is combined with the query feature, the ground truth is the same. However, during testing the class score is tied to the attention vector, and the score for background comes from the first attention vector. The loss function does not build a connection between the predicted class and the class of the attention vector.

Is this strange?

Requirements

@YoungXIAO13 Do we need to have exactly the versions from the following list?

  • CUDA 8.0
  • Python=3.6
  • PyTorch=0.4.0
  • torchvision=0.2.1
  • gcc >= 4.9

I am using:

  • CUDA 10.0
  • Python=3.6
  • PyTorch=1.5.1
  • torchvision=0.6.1
  • gcc = 7.5

coco training time

Hi, I use your script run/train_coco.sh to perform base training on the COCO dataset. But the number of epochs is set to 21 with batch_size 4, which makes the training time very long, since COCO is a large dataset. I am wondering about this setting.

COCO training missing annotation files?

Hi,

congratulations on the nice paper and repository! I am trying to reproduce the results on COCO with @hjraad, but I've run into two problems:

Problem 1) It seems I am missing 2 annotation files. I tried to run train_coco.sh and got this error about two missing files:

FileNotFoundError: [Errno 2] No such file or directory: 
 /FewShotDetection/data/coco/annotations/instances_valminusminival2014.json'
FileNotFoundError: [Errno 2] No such file or directory: '/FewShotDetection/data/coco/annotations/instances_minival2014.json'

I downloaded COCO following the README.md under data/, and obtained these files in annotations:

instances_train2014.json
captions_train2014.json
instances_val2014.json
captions_val2014.json
person_keypoints_train2014.json
instances_shots2014.json
person_keypoints_val2014.json

How should I get the missing files? Am I missing anything? Thanks in advance.

Problem 2) I also tried to run finetune_coco.sh with the provided base checkpoint, but it has problems loading the data. Somehow it only loads 2 images, whereas for 10-shot over multiple classes I would expect more. Please see below:

Namespace(TFA=False, batch_size=4, checkepoch=20, checkpoint=21985, checkpoint_interval=10000, checksession=200, class_agnostic=False, cuda=True, dataset='coco', disp_interval=100, fix_encoder=False, log_dir='checkpoint', lr=0.001, lr_decay_gamma=0.1, lr_decay_step=4, max_epochs=30, meta_loss=True, meta_train=True, meta_type=0, net='metarcnn', num_workers=8, optimizer='sgd', phase=2, resume=True, save_dir='save_models/COCO', session=1, shots=10, start_epoch=1, use_tfboard=True)

[...]
Loaded dataset coco_2014_shots for training
[...]
before filtering, there are 2 images...
after filtering, there are 2 images...

Any clues on this? @YoungXIAO13

Thanks!

Question about experimental result

Thanks for a nice paper!
I have two questions.

  1. In the paper, you mention "Results averaged over multiple random runs".
    I'd like to ask which samples you used to calculate the averaged results.

  2. Is there an experimental result with the same samples as those used by MetaYOLO (ICCV 2019)?

Wrong category indexing in coco evaluation (might be the cause for AP(novel)>AP(base))

I probably found an issue with the indexing of categories in the COCO evaluation. This might cause interchanged results for each category, which would also affect AP (novel) and AP (base) since it is unknown which result belongs to which category (#30). The overall mAP is unaffected by this issue.

At inference, the test-class puts the detections inside the variable all_boxes where the classes are indexed by their position in imdb.classes (which is Background-Class + Base-Classes + Novel-Classes, as defined in the constructor of coco.py).

As coco.py does the evaluation, the following happens:

  • its method evaluate_detections gets the detections via all_boxes from the test class
  • evaluate_detections calls _write_coco_results_file
  • _write_coco_results_file iterates over its categories (self.classes) and uses the mapping coco_cat_id = self._class_to_coco_cat_id[cls] to obtain the category id from category name. It passes each category ID to _coco_results_one_category which will return detections in a different format:
    • x,y,w,h instead of xmin, ymin, xmax, ymax
    • image id instead of image index
    • category ID instead of category
  • Now we have saved the detections in a different format to a json file which will be passed to _do_detection_eval
  • _do_detection_eval creates an instance of COCOeval with
    • itself (the COCO object, initialized with the validation-annotation file)
    • another COCO-object initialized with the previously created json-file (the rewritten detections)
  • _do_detection_eval runs evaluate and accumulate on the cocoeval object and passes it to _print_detection_eval_metrics
  • inside COCOeval this takes place:
    • in its constructor, it sets self.params.catIds = sorted(cocoGt.getCatIds()), where cocoGt is the COCO-object initialized with the validation annotation file
    • evaluate() uses those catIDs to identify categories
    • accumulate() stores precision and recall of a category at the index of that category in catIDs (stores them in self.eval)

Now we have two problematic situations inside _print_detection_eval_metrics() method of coco.py:

  • printing of class-wise AP:
    • directly accesses cocoeval.eval['precision'] with class indices from cls_ind, cls in enumerate(self.classes), but as stated above, the metrics for a class are stored at the index of that class as in the validation annotation file. This causes the category names for per-category results to be interchanged
  • printing of summarized novel class mAP and base class mAP:
    • passes range(0, len(base_classes)) for summary of base classes and range(len(base_classes), len(all_classes)) for summary of novel classes to cocoeval.summarize. However, the summarize method uses the categoryId argument to directly access the precision and recall of that class, but those indices are wrong (as described above for class-wise AP)

To solve the stated problems, I would suggest the following changes (for _print_detection_eval_metrics, line 245-259)

cat_ids = self._COCO.getCatIds()
cats = self._COCO.loadCats(cat_ids)
cat_name_to_ind = dict(list(zip([c['name'] for c in cats], range(len(cats)))))
for cls_ind, cls in enumerate(cat_name_to_ind.keys()):
    # no check for cls == '__background__' needed due to new list we're iterating over
    precision = coco_eval.eval['precision'][ind_lo:(ind_hi + 1), :, cls_ind, 0, 2]  # no index shift necessary
    ap = np.mean(precision[precision > -1])
    print('{}: {:.1f}'.format(cls, 100 * ap))

print('~~~~ Summary Base metrics ~~~~')
categoryId = list(map(lambda cls: cat_name_to_ind[cls], self._base_classes))  # use correct indices now
coco_eval.summarize(categoryId)

print('~~~~ Summary Novel metrics ~~~~')
categoryId = list(map(lambda cls: cat_name_to_ind[cls], self._novel_classes))  # use correct indices now
coco_eval.summarize(categoryId)

self._base_classes and self._novel_classes are the lists of base and novel class names which I used to create self._classes in the constructor.

Some final thoughts on the issue:

  • I think using a variable name categoryId inside cocoeval.summarize is a bit confusing, since they treat them as indices
  • The crucial mistake presumably was the different order of the categories inside self.classes (of coco.py) and the categories as in the COCO object
    • In coco.py of Meta-RCNN they leave the categories in the same order as they are read in from the COCO API and just prepend a background class. That's why they are able to directly iterate over self.classes (instead of having to read in original coco categories for the correct order) and just have to do a simple index shift to obtain correct results for each class.
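A quick way to confirm the mismatch is to print both orderings side by side; a sketch assuming pycocotools and the minival annotation file, with imdb_classes standing in for the class list built in coco.py:

```python
from pycocotools.coco import COCO

coco_gt = COCO('data/coco/annotations/instances_minival2014.json')
cat_ids = sorted(coco_gt.getCatIds())                   # the order COCOeval will use
coco_order = [c['name'] for c in coco_gt.loadCats(cat_ids)]

# imdb_classes: the Background + Base + Novel ordering from coco.py (placeholder here).
for i, (gt_name, imdb_name) in enumerate(zip(coco_order, imdb_classes[1:])):
    if gt_name != imdb_name:
        print('index {}: COCOeval sees "{}", coco.py assumes "{}"'.format(i, gt_name, imdb_name))
```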

Issue about the class's attentions of every roi of image in faster_rcnn.py.

In line 122 of the faster_rcnn.py,
"proposal_labels = rois_label[b * 128:(b + 1) * 128].data.cpu().numpy()[0]"
The proposal_labels here only takes the label of the first RoI, not the labels of all RoIs.

Then in line 123
"unique_labels = list(np.unique(proposal_labels))"

It does not find the unique labels over all RoIs of the input image, but only over the first RoI. Then channel-wise multiplication with the attention of that specific class is applied to all RoIs, but the class picked out in line 123 is not representative of all RoIs.
I think this may be wrong, and it makes me very confused.
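What I would have expected instead is something like the sketch below (rois_label and b come from the surrounding code in faster_rcnn.py):

```python
import numpy as np

# Take the labels of all 128 RoIs of image b, not just the first one.
proposal_labels = rois_label[b * 128:(b + 1) * 128].data.cpu().numpy()   # no trailing [0]
unique_labels = list(np.unique(proposal_labels))
```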

Hope you can give me some suggestions, thank you very much.
