fewx's People

Contributors

fanq15, ze-yang

fewx's Issues

__init__() got an unexpected keyword argument 'first_stride'

Traceback (most recent call last):
File "mytest.py", line 253, in
model = init()
File "mytest.py", line 196, in init
predictor = DefaultPredictor(cfg)
File "/usr/local/lib/python3.7/dist-packages/detectron2/engine/defaults.py", line 216, in init
self.model = build_model(self.cfg)
File "/usr/local/lib/python3.7/dist-packages/detectron2/modeling/meta_arch/build.py", line 21, in build_model
model = META_ARCH_REGISTRY.get(meta_arch)(cfg)
File "mytest.py", line 43, in init
self.roi_heads = build_roi_heads(cfg, self.backbone.output_shape())
File "/content/drive/My Drive/FewX/fewx/modeling/fsod/fsod_roi_heads.py", line 44, in build_roi_heads
return ROI_HEADS_REGISTRY.get(name)(cfg, input_shape)
File "/content/drive/My Drive/FewX/fewx/modeling/fsod/fsod_roi_heads.py", line 75, in init
self.res5, out_channels = self._build_res5_block(cfg)
File "/content/drive/My Drive/FewX/fewx/modeling/fsod/fsod_roi_heads.py", line 102, in _build_res5_block
stride_in_1x1=stride_in_1x1,
File "/usr/local/lib/python3.7/dist-packages/detectron2/modeling/backbone/resnet.py", line 609, in make_stage
return ResNet.make_stage(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/detectron2/modeling/backbone/resnet.py", line 541, in make_stage
block_class(in_channels=in_channels, out_channels=out_channels, **curr_kwargs)
TypeError: __init__() got an unexpected keyword argument 'first_stride'

How can I solve this problem?
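
A workaround that has been reported for newer detectron2 releases (a hedged sketch, not the authors' fix): `ResNet.make_stage` dropped the `first_stride` argument in favor of per-block strides, so the call in `_build_res5_block` can be rewritten along these lines. The channel values below mirror detectron2's standard `Res5ROIHeads` and may need adjusting to FewX's copy of that code.

```python
# Hedged sketch: replacing the deprecated first_stride argument with
# stride_per_block when building a res5 stage (newer detectron2 API).
from detectron2.modeling.backbone.resnet import BottleneckBlock, ResNet

# Example channel sizes matching a ResNet-50 res5 stage; inside
# _build_res5_block these values come from the config.
out_channels = 2048
bottleneck_channels = 512

res5_blocks = ResNet.make_stage(
    BottleneckBlock,
    3,
    stride_per_block=[2, 1, 1],     # was: first_stride=2
    in_channels=out_channels // 2,
    bottleneck_channels=bottleneck_channels,
    out_channels=out_channels,
    num_groups=1,
    norm="FrozenBN",
    stride_in_1x1=True,
)
```

Alternatively, pinning detectron2 to the release the repository was developed against avoids the API change altogether.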

A question about data

Thanks for your good work!

I want to use my own data. Should I convert it to the COCO format?
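
Converting to COCO format is the path of least resistance with a detectron2-based codebase. A minimal COCO-style annotation skeleton looks like the sketch below (standard COCO field names; the ids, paths, and class name are placeholders):

```python
# Minimal COCO-format skeleton for a custom dataset (illustrative values only).
import json

coco = {
    "images": [
        {"id": 1, "file_name": "000001.jpg", "height": 480, "width": 640},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 120, 50, 80],   # [x, y, width, height]
         "area": 50 * 80, "iscrowd": 0},
    ],
    "categories": [
        {"id": 1, "name": "my_class"},
    ],
}

with open("my_dataset_train.json", "w") as f:
    json.dump(coco, f)
```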

How to get 12 mAP?

I ran all.sh and got nothing, so I used the Detectron2 demo prediction code to generate the COCO-format results, but the result is far too low:

loading annotations into memory...
Done (t=0.50s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.38s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=11.88s).
Accumulating evaluation results...
DONE (t=2.87s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.011
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.020
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.010
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.006
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.023
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.027
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.032
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.032
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.010
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.071

Issues with num_gpus

With 4 GPUs, I reran the default settings in all.sh and the AP is correct (11.27 on the 20 VOC categories).

However, when I try to use another machine equipped with 2 GPUs, the loss_cls becomes strange and the AP at the end of training is near 0.

Hereby I provide my training log for debugging.
fsod_train_log.txt

Comparing the logs from the 2-GPU machine and the 4-GPU machine, the loss_cls already diverges before iteration 2999, as can be seen below:

2 GPUs

[08/13 11:32:08 d2.utils.events]: eta: 2 days, 10:28:56  iter: 2999  total_loss: 0.935  loss_cls: 0.566  loss_box_reg: 0.255  loss_rpn_cls: 0.061  loss_rpn_loc: 0.015  time: 1.8013  data_time: 0.0312  lr: 0.004000  max_mem: 7442M

4 GPUs

[08/07 11:07:45 d2.utils.events]: eta: 1 day, 5:50:52  iter: 2999  total_loss: 0.811  loss_cls: 0.476  loss_box_reg: 0.226  loss_rpn_cls: 0.077  loss_rpn_loc: 0.019  time: 0.9198  data_time: 0.0164  lr: 0.004000  max_mem: 4173M

From your code, I do not see anything related to num-gpus. Maybe some extra handling is needed by Detectron2 when num-gpus changes?
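
A hedged suggestion rather than a diagnosis: detectron2's SOLVER.IMS_PER_BATCH is the total batch size summed over all GPUs, so it is worth confirming that both machines actually run the same effective batch size. If they differ, a common rule of thumb is to rescale the learning rate and schedule linearly, roughly as sketched below.

```python
# Hedged sketch (not from the FewX authors): linear scaling of the solver
# schedule when the total batch size changes, e.g. after moving to a machine
# with a different GPU count.
def scale_solver(cfg, reference_batch: int):
    """Rescale LR and schedule relative to the batch size the config assumed."""
    scale = cfg.SOLVER.IMS_PER_BATCH / reference_batch
    cfg.SOLVER.BASE_LR *= scale
    cfg.SOLVER.MAX_ITER = int(cfg.SOLVER.MAX_ITER / scale)
    cfg.SOLVER.STEPS = tuple(int(s / scale) for s in cfg.SOLVER.STEPS)
    cfg.SOLVER.WARMUP_ITERS = int(cfg.SOLVER.WARMUP_ITERS / scale)
    return cfg
```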

RuntimeError

when I run "sh all,sh",there was an error:
RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1591914742272/work/torch/lib/c10d/ProcessGroupNCCL.cpp:32, unhandled cuda error, NCCL version 2.4.8

I tried to train on 2 GPUs and changed the learning rate, but I don't know where the problem occurred.

No such file or directory: './support_dir/support_feature.pkl'

Thanks for your code! But I met this error when I ran your code on the COCO dataset. Could you please tell me what this file is for, and where I can get it? It seems that the files train_support_df.pkl and 10_shot_support_df.pkl organize the support data, so I'm confused about this error.

[08/18 14:15:37 d2.evaluation.evaluator]: Start inference on 5000 images
/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/setuptools/distutils_patch.py:26: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first.
"Distutils was imported before Setuptools. This usage is discouraged "
Traceback (most recent call last):
File "fsod_train_net.py", line 118, in
args=(args,),
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/detectron2/engine/launch.py", line 62, in launch
main_func(*args)
File "fsod_train_net.py", line 101, in main
res = Trainer.test(cfg, model)
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 516, in test
results_i = inference_on_dataset(model, data_loader, evaluator)
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/detectron2/evaluation/evaluator.py", line 141, in inference_on_dataset
outputs = model(inputs)
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/home/ly/few-shot-object-detection/FewX-master/fewx/modeling/fsod/fsod_rcnn.py", line 126, in forward
self.init_model()
File "/home/ly/few-shot-object-detection/FewX-master/fewx/modeling/fsod/fsod_rcnn.py", line 302, in init_model
with open(support_file_name, 'wb') as f:

FileNotFoundError: [Errno 2] No such file or directory: './support_dir/support_feature.pkl'
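
The traceback fails while *writing* support_feature.pkl (the `open(..., 'wb')` call), so the likely cause is simply that the `./support_dir` directory does not exist in the working directory. A hedged sketch of the minimal fix:

```python
# Hedged sketch: create the missing directory before running evaluation,
# so fsod_rcnn.init_model() can write ./support_dir/support_feature.pkl.
import os

os.makedirs("./support_dir", exist_ok=True)
```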

NaN

Dear author, thanks for your great work. Currently I am trying to run your code, but it always reports a NaN error. The error traceback is below; could you have a look? Thanks in advance!

detectron2/detectron2/modeling/proposal_generator/proposal_utils.py", line 103, in find_top_rpn_proposals
    raise FloatingPointError(
FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged.
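
For the record, and without claiming this is the authors' intended setup, common mitigations in detectron2 when training diverges are lowering the base learning rate or enabling gradient clipping through the standard solver options:

```python
# Hedged sketch: standard detectron2 solver options that often help when
# losses hit Inf/NaN. The values are illustrative, not tuned for FewX.
from detectron2.config import get_cfg

cfg = get_cfg()   # in practice, the FewX training config built in fsod_train_net.py
cfg.SOLVER.BASE_LR *= 0.5                       # try a smaller learning rate
cfg.SOLVER.CLIP_GRADIENTS.ENABLED = True        # clip gradients by value
cfg.SOLVER.CLIP_GRADIENTS.CLIP_TYPE = "value"
cfg.SOLVER.CLIP_GRADIENTS.CLIP_VALUE = 1.0
```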

Predict on new images

Hello,

Given the provided model (model_final.pth), how would I predict on new images?
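
The repository does not appear to ship a demo script, so below is a rough single-image inference sketch along standard detectron2 lines. The `fewx.config` import, the config path, and the weight filename are assumptions and may need adjusting; also, as the support_feature.pkl issue above shows, FsodRCNN builds or loads support features in `init_model()` on its first eval-mode forward pass, so the support data referenced by the config must be in place.

```python
# Hedged sketch: single-image inference with the provided FewX weights.
# Assumptions: fewx.config exposes a get_cfg() that knows the FS options,
# importing fsod_rcnn registers the FsodRCNN meta-architecture, and the
# fine-tuning config lives at configs/fsod/finetune_R_50_C4_1x.yaml.
import cv2
import torch
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.modeling import build_model
from fewx.config import get_cfg                     # assumed helper
import fewx.modeling.fsod.fsod_rcnn  # noqa: F401   # registers the meta-architecture

cfg = get_cfg()
cfg.merge_from_file("configs/fsod/finetune_R_50_C4_1x.yaml")   # assumed path
cfg.MODEL.WEIGHTS = "model_final.pth"

model = build_model(cfg)
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)
model.eval()

img = cv2.imread("my_image.jpg")                    # BGR, as detectron2 expects
height, width = img.shape[:2]
image_tensor = torch.as_tensor(img.astype("float32").transpose(2, 0, 1))
inputs = {"image": image_tensor, "height": height, "width": width}

with torch.no_grad():
    outputs = model([inputs])[0]
print(outputs["instances"].to("cpu"))
```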

Apply model to new dataset

Hi~

I was trying to apply your FSOD model to a new dataset, but encountered some problems when registering the dataset. I have read some Detectron2 documentation but am still confused. Would you be so kind as to explain the dataset registration procedure?

Thanks a lot.
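
For a COCO-format dataset, detectron2's built-in helper is usually enough; the dataset names and paths below are placeholders, and the FewX configs would then point DATASETS.TRAIN / DATASETS.TEST at the registered names.

```python
# Hedged sketch: registering a custom COCO-format dataset with detectron2.
from detectron2.data.datasets import register_coco_instances

register_coco_instances("my_dataset_train", {},
                        "datasets/my_dataset/annotations/train.json",
                        "datasets/my_dataset/images/train")
register_coco_instances("my_dataset_val", {},
                        "datasets/my_dataset/annotations/val.json",
                        "datasets/my_dataset/images/val")
```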

Difference between final_split_non_voc_instances_train2017.json and your log file

I followed step 2 to generate final_split_non_voc_instances_train2017.json. The resulting dataset is slightly different from your training log. I am wondering whether I missed something? Thanks.

| category      | #instances | category     | #instances | category      | #instances |
|:-------------:|:-----------|:------------:|:-----------|:-------------:|:-----------|
| person        | 0          | bicycle      | 0          | car           | 0          |
| motorcycle    | 0          | airplane     | 0          | bus           | 0          |
| train         | 0          | truck        | 4483       | boat          | 0          |
| traffic light | 1545       | fire hydrant | 1172       | stop sign     | 1130       |
| parking meter | 511        | bench        | 3977       | bird          | 0          |
| cat           | 0          | dog          | 0          | horse         | 0          |
| sheep         | 0          | cow          | 0          | elephant      | 4007       |
| bear          | 1241       | zebra        | 4090       | giraffe       | 4481       |
| backpack      | 2632       | umbrella     | 3634       | handbag       | 3029       |
| tie           | 2666       | suitcase     | 3102       | frisbee       | 1343       |
| skis          | 2570       | snowboard    | 1296       | sports ball   | 848        |
| kite          | 1327       | baseball bat | 996        | baseball gl.. | 749        |
| skateboard    | 2922       | surfboard    | 3377       | tennis racket | 1693       |
| bottle        | 0          | wine glass   | 2069       | cup           | 6270       |
| fork          | 2498       | knife        | 2743       | spoon         | 2037       |
| bowl          | 5207       | banana       | 3753       | apple         | 2231       |
| sandwich      | 2972       | orange       | 2765       | broccoli      | 4347       |
| carrot        | 2964       | hot dog      | 1829       | pizza         | 3846       |
| donut         | 4160       | cake         | 3422       | chair         | 0          |
| couch         | 0          | potted plant | 0          | bed           | 3005       |
| dining table  | 0          | toilet       | 3464       | tv            | 0          |
| laptop        | 2289       | mouse        | 918        | remote        | 1480       |
| keyboard      | 1351       | cell phone   | 2602       | microwave     | 670        |
| oven          | 1346       | toaster      | 73         | sink          | 2858       |
| refrigerator  | 1271       | book         | 3823       | clock         | 2682       |
| vase          | 2982       | scissors     | 968        | teddy bear    | 3386       |
| hair drier    | 119        | toothbrush   | 783        |               |            |
| total         | 148004     |              |            |               |            |

About detectron2 dependency

Thanks for your release.
Would you mind specifying a detectron2 (d2) commit number? d2 keeps developing, and breaking changes may be made that could affect reproducibility.
Thanks in advance.

About trained model

Thanks for your inspiring work!
I have a question about the trained models you provided.
Based on my understanding, the two provided .pth files on Google Drive are trained on COCO: a base one and a fine-tuned one. But where can I find the one trained on the FSOD dataset (without fine-tuning)? I'd appreciate any clarification! @fanq15


Training ends with NaN loss

Thanks for your code! When I try to run it on my dataset, the error output is:
File "FewX-master/fewx/data/dataset_mapper.py", line 234, in generate_support
support_id= self.support_df.loc[
(self.support_df['category_id'] == query_cls) &
(~self.support_df['image_id'].isin(used_image_id))
& (~self.support_df['id'].isin(used_id_ls))
, 'id'].sample(random_state=id).tolist()[0]

1) Is there anything wrong with generating the pickle file?
If I change the code to:
support_id = self.support_df.loc[(self.support_df['category_id'] == query_cls), 'id'].sample(random_state=id).tolist()[0]
this error happens:
detectron2/detectron2/modeling/proposal_generator/proposal_utils.py", line 95, in find_top_rpn_proposals
"Predicted boxes or scores contain Inf/NaN. Training has diverged."
FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged.
2) Could you give some suggestions?
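
One likely cause (a hedged guess, not confirmed by the authors) is that some category in the custom support dataframe has too few distinct support instances or images, so the chained filters in generate_support leave an empty frame and `.tolist()[0]` fails. A quick sanity check:

```python
# Hedged sketch: count distinct support instances and images per category in
# the support dataframe. The pickle path is an assumption; use wherever your
# generation script wrote the dataframe.
import pandas as pd

support_df = pd.read_pickle("./datasets/coco/10_shot_support_df.pkl")  # assumed path
summary = support_df.groupby("category_id").agg(
    n_instances=("id", "nunique"),
    n_images=("image_id", "nunique"),
)
print(summary.sort_values("n_instances").head(20))  # weakest categories first
```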

How to detect my own pictures?

This project cannot detect my own pictures or my custom classes; could you release the code for that? Meanwhile, I cannot train on your FSOD data.

How to test?

Hi.
Training and evaluation were conducted with the model. I'd like to check the bounding boxes on the images at test time, so could you tell me what to do?

I'd really appreciate it if you'd let me know.

Why does the mapper need to set a random seed when selecting support images?

support_id = self.support_df.loc[(self.support_df['category_id'] == other_cls) & (~self.support_df['image_id'].isin(used_image_id)) & (~self.support_df['id'].isin(used_id_ls)), 'id'].sample(random_state=id).tolist()[0]

support_id = self.support_df.loc[(self.support_df['category_id'] == query_cls) & (~self.support_df['image_id'].isin(used_image_id)) & (~self.support_df['id'].isin(used_id_ls)), 'id'].sample(random_state=id).tolist()[0]

Query images

Hello, thank you for your work! I have a question: do the query images in your code come from COCO val2017?

FSVOD code

Dear Qi Fan,

This is great work on the few-shot learning task! It seems that only FSOD for image object detection is implemented. Do you have a plan to release the FSVOD code for video object detection, and when is it expected? Thanks!

bounding box

How to visualize the bounding boxes with scores?
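
A hedged sketch of drawing predicted boxes with scores using detectron2's Visualizer; `outputs` is assumed to be the prediction dict returned by the model for the same image:

```python
# Hedged sketch: visualize predicted boxes and scores with detectron2.
# `outputs` is the model's prediction dict for `img` (BGR, as read by OpenCV);
# metadata supplies class names for a registered dataset.
import cv2
from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import Visualizer

img = cv2.imread("my_image.jpg")
metadata = MetadataCatalog.get("coco_2017_val")        # or your own dataset name
v = Visualizer(img[:, :, ::-1], metadata=metadata, scale=1.0)
vis = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imwrite("prediction.jpg", vis.get_image()[:, :, ::-1])
```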

Custom Dataset

Nice work. How would you extend the data prep and the rest of the pipeline for a custom dataset, assuming the same N-way K-shot (N, K) setup as COCO or VOC and the same number of classes? It might be good to add a section on this to the README.

How much memory is needed for inference?

My graphics card is a GTX 1660 Ti with 6 GB of memory.
Running this code reports an error:
RuntimeError: CUDA out of memory. Tried to allocate 1.00 GiB (GPU 0; 5.81 GiB total capacity; 2.90 GiB already allocated; 420.50 MiB free; 3.84 GiB reserved in total by PyTorch)
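
Not an official recommendation, but if 6 GB is too little at the default test resolution, reducing the test-time image size through the standard detectron2 input options may help (the provided configs use MIN_SIZE_TEST: 600 / MAX_SIZE_TEST: 1000):

```python
# Hedged sketch: lower the test-time image size to reduce inference memory.
# `cfg` stands for the evaluation config; values here are illustrative.
cfg.INPUT.MIN_SIZE_TEST = 400
cfg.INPUT.MAX_SIZE_TEST = 667
```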

COCO dataset

After cloning the project, how do I prepare the COCO dataset?

Dataset Generation

Three questions:

  1. In 1_split_filter.py#L46-L48, as I understand it, sampled images should not contain objects of the VOC classes. However, this implementation seems to exclude only images with tiny objects;

  2. In 2_balance.py#L57, does each category contain no more than 80 instances?

  3. How to generate final_split_voc_10_shot_instances_train2017.json ?

CUDA out of memory

I use my own custom COCO dataset to train FSOD on 4 GPUs (2080 Ti). Training costs 2315 MB per GPU and works, but testing always runs out of CUDA memory, whatever I do to adjust the test image size. Would you give me some advice?

Where can I find the model_final.pth?

Thanks for your wonderful work! I'm wondering how I can get the pretrained model parameter dictionary 'model_final.pth' mentioned in your code. It's a '.pth', not a '.pkl'. I have downloaded 'model_final.pkl' from the URL in the README. Do I need to train and save the '.pth' file myself? Would you please share the best '.pth' file? Thanks again!

Yu

About final_split_voc_10_shot_instances_train2017 generation

I don't understand line 33 of FewX-master/datasets/coco/6_voc_few_shot.py, which is used to generate the k-shot novel dataset file final_split_voc_10_shot_instances_train2017.json. Line 33 requires that the images selected into the k-shot novel dataset contain only one object. For what purpose?

TypeError

When I run all.sh, there is an error: TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

The whole message is as follows:

Rank of current process: 0. World size: 2
Traceback (most recent call last):
File "fsod_train_net.py", line 118, in
args=(args,),
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/engine/launch.py", line 59, in launch
daemon=False,
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/engine/launch.py", line 94, in _distributed_worker
main_func(*args)
File "/home/ubuntu/FewX/fsod_train_net.py", line 94, in main
cfg = setup(args)
File "/home/ubuntu/FewX/fsod_train_net.py", line 85, in setup
default_setup(cfg, args)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/engine/defaults.py", line 132, in default_setup
logger.info("Environment info:\n" + collect_env_info())
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/utils/collect_env.py", line 136, in collect_env_info
msg = " - invalid!" if not os.path.isdir(CUDA_HOME) else ""
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/genericpath.py", line 42, in isdir
st = os.stat(s)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
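
The crash happens while detectron2 collects environment info and `CUDA_HOME` is `None`, i.e. PyTorch cannot locate the CUDA toolkit on that machine. A quick check (hedged sketch):

```python
# Hedged sketch: verify that PyTorch can locate the CUDA toolkit. If this
# prints None, point CUDA_HOME at your toolkit (e.g. /usr/local/cuda) in the
# shell environment before launching all.sh.
from torch.utils.cpp_extension import CUDA_HOME

print(CUDA_HOME)
```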

Why does the config in the fine-tuning stage set the K-shot to only 9?

Hello, why does the config in the fine-tuning stage set the K-shot to only 9-shot?

The config is as below:

BASE: "Base-FSOD-C4.yaml"
MODEL:
WEIGHTS: "./output/fsod/R_50_C4_1x/model_final.pth"
MASK_ON: False
RESNETS:
DEPTH: 50
BACKBONE:
FREEZE_AT: 5
DATASETS:
TRAIN: ("coco_2017_train_voc_10_shot",)
TEST: ("coco_2017_val",)
SOLVER:
IMS_PER_BATCH: 4
BASE_LR: 0.001
STEPS: (2000, 3000)
MAX_ITER: 3000
WARMUP_ITERS: 200
INPUT:
FS:
FEW_SHOT: True
SUPPORT_WAY: 2
SUPPORT_SHOT: 9
MIN_SIZE_TRAIN: (440, 472, 504, 536, 568, 600)
MAX_SIZE_TRAIN: 1000
MIN_SIZE_TEST: 600
MAX_SIZE_TEST: 1000
OUTPUT_DIR: './output/fsod/finetune_dir/R_50_C4_1x'

Occupancy rate of CPU

When I run all.sh, the CPU occupancy rate becomes much higher, even close to 90%. I have no idea how to solve this problem; could you give some advice?
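
High CPU usage typically comes from the dataloader worker processes and the OpenMP/BLAS thread pools. A hedged suggestion, not the authors' recommendation: reduce the number of dataloader workers (and optionally export OMP_NUM_THREADS=2 in the shell before launching all.sh).

```python
# Hedged sketch: fewer dataloader workers mean less CPU load, at the cost of
# data-loading throughput. `cfg` stands for the training config.
cfg.DATALOADER.NUM_WORKERS = 2   # detectron2's default is 4
```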

Support set for base classes and Limit on no. of ways

Hello. I'm attempting to use a different dataset with the AttentionRPN. The evaluation script uses the pickled dataframe of the 10-shot novel classes, so when I evaluate, it detects only novel classes.

If I create a 10-shot support DF of the base classes and use that, I am able to detect the same. Is this the intended way of evaluating the base classes or have I done something incorrectly? I presume in the final evaluation of novel+base, another support DF needs to be created?

Also, there are places where the number of ways is fixed to 2 and this is asserted. Is it safe to remove those assertions?

cannot find Attention RPN operation

Hello, thank you for your work! I have a question. I carefully read your implementation of FSOD, and I cannot find the Attention RPN operation in the model. The log makes this model look like a plain Faster R-CNN model. I cannot find the global average pooling on the support image or the convolution operation on the query image. How do you implement this operation in this code? Thank you!

final_split_voc_10_shot_instances_train2017.json

Thanks for your code!
I want to train on my own datasets.
However, I am not aware of how final_split_voc_10_shot_instances_train2017.json is made.
Could you share the code with me?
Thanks a lot again!

IndexError: list index out of range

When I add a new class to my training set and run sh all.sh, there is an error:
query_cls = self.support_df.loc[self.support_df['id']==id, 'category_id'].tolist()[0] # they share the same category_id and image_id
IndexError: list index out of range
I checked my dataset_dict and support_df and didn't find anything wrong, so I am confused. Can you give me some advice?
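
The failing line looks up the annotation `id` in the support dataframe, so a likely cause (hedged, not confirmed by the authors) is that annotations for the new class exist in the training JSON but were never added to the support dataframe. A quick consistency check:

```python
# Hedged sketch: verify that every annotation id in the training JSON also
# appears in the support dataframe. Paths are assumptions; use your own files.
import json
import pandas as pd

support_df = pd.read_pickle("./datasets/coco/train_support_df.pkl")    # assumed path
with open("./datasets/coco/my_train_annotations.json") as f:           # assumed path
    coco = json.load(f)

ann_ids = {a["id"] for a in coco["annotations"]}
missing = ann_ids - set(support_df["id"])
print(f"{len(missing)} annotation ids are missing from the support dataframe")
```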

Training on FSOD dataset

Thanks for sharing your implementation! Would you also share your code for training on your FSOD dataset? I would like to use your FSOD dataset for my own experiments, but I don't know how to get started. I have downloaded it, but I am unsure which images I should use as query images. I also don't understand the intention behind the directory structure.
In your paper you wrote that you used a 2-way 5-shot evaluation protocol. How are the 200 test categories split into pairs of two?
