fanq15 / fewx
FewX is an open-source toolbox on top of Detectron2 for data-limited instance-level recognition tasks.
Home Page: https://github.com/fanq15/FewX
License: MIT License
Traceback (most recent call last):
  File "mytest.py", line 253, in <module>
    model = init()
  File "mytest.py", line 196, in init
    predictor = DefaultPredictor(cfg)
  File "/usr/local/lib/python3.7/dist-packages/detectron2/engine/defaults.py", line 216, in __init__
    self.model = build_model(self.cfg)
  File "/usr/local/lib/python3.7/dist-packages/detectron2/modeling/meta_arch/build.py", line 21, in build_model
    model = META_ARCH_REGISTRY.get(meta_arch)(cfg)
  File "mytest.py", line 43, in __init__
    self.roi_heads = build_roi_heads(cfg, self.backbone.output_shape())
  File "/content/drive/My Drive/FewX/fewx/modeling/fsod/fsod_roi_heads.py", line 44, in build_roi_heads
    return ROI_HEADS_REGISTRY.get(name)(cfg, input_shape)
  File "/content/drive/My Drive/FewX/fewx/modeling/fsod/fsod_roi_heads.py", line 75, in __init__
    self.res5, out_channels = self._build_res5_block(cfg)
  File "/content/drive/My Drive/FewX/fewx/modeling/fsod/fsod_roi_heads.py", line 102, in _build_res5_block
    stride_in_1x1=stride_in_1x1,
  File "/usr/local/lib/python3.7/dist-packages/detectron2/modeling/backbone/resnet.py", line 609, in make_stage
    return ResNet.make_stage(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/detectron2/modeling/backbone/resnet.py", line 541, in make_stage
    block_class(in_channels=in_channels, out_channels=out_channels, **curr_kwargs)
TypeError: __init__() got an unexpected keyword argument 'first_stride'
How can I solve this problem?
Thanks for your good work!
I want to use my own data. Should I convert it to the COCO format?
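For readers with the same question, here is a minimal sketch of what a COCO-style detection annotation file contains. The helper name `make_coco_dataset`, the file name, and the class name are hypothetical and only illustrate the layout; a file in this layout is what Detectron2's `register_coco_instances` expects as its JSON input, and whether FewX needs further preprocessing on top of it is not confirmed by this thread.

```python
import json


def make_coco_dataset(images, boxes, class_names):
    """Build a minimal COCO-style detection dict (hypothetical helper).

    images: list of (file_name, width, height)
    boxes:  list of (image_index, class_name, x, y, w, h) in pixels
    """
    categories = [{"id": i + 1, "name": n} for i, n in enumerate(class_names)]
    name_to_id = {c["name"]: c["id"] for c in categories}
    coco_images = [
        {"id": i + 1, "file_name": f, "width": w, "height": h}
        for i, (f, w, h) in enumerate(images)
    ]
    annotations = []
    for ann_id, (img_idx, name, x, y, w, h) in enumerate(boxes, start=1):
        annotations.append({
            "id": ann_id,
            "image_id": coco_images[img_idx]["id"],
            "category_id": name_to_id[name],
            "bbox": [x, y, w, h],  # COCO boxes are [x_min, y_min, width, height]
            "area": w * h,
            "iscrowd": 0,
        })
    return {
        "images": coco_images,
        "annotations": annotations,
        "categories": categories,
    }


# Illustrative data only: one image with one box of a made-up class.
dataset = make_coco_dataset(
    images=[("img_0001.jpg", 640, 480)],
    boxes=[(0, "widget", 10, 20, 100, 50)],
    class_names=["widget"],
)
coco_json = json.dumps(dataset)
```

The resulting string can be written to a `.json` file and registered as a dataset in the usual Detectron2 way.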
This error comes up while running all.sh
I ran all.sh and got nothing, so I used the Detectron2 demo prediction code to generate the COCO results, and the result is too low.
loading annotations into memory...
Done (t=0.50s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.38s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=11.88s).
Accumulating evaluation results...
DONE (t=2.87s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.011
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.020
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.010
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.006
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.023
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.027
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.032
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.032
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.010
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.071
Hi, congratulations on your great work on FSVOD! May I know when you will make it available to the public? Thanks!
With 4 GPUs, I reran the default settings in all.sh and the AP is correct (11.27 on the 20 VOC categories).
However, when I use another machine equipped with 2 GPUs, loss_cls behaves strangely and the AP at the end of training is near 0.
I provide my training log here for debugging.
fsod_train_log.txt
Comparing the logs from the 2-GPU machine and the 4-GPU machine, loss_cls already differs before iteration 2999, as shown below:
2 GPUs
[08/13 11:32:08 d2.utils.events]: eta: 2 days, 10:28:56 iter: 2999 total_loss: 0.935 loss_cls: 0.566 loss_box_reg: 0.255 loss_rpn_cls: 0.061 loss_rpn_loc: 0.015 time: 1.8013 data_time: 0.0312 lr: 0.004000 max_mem: 7442M
4 GPUs
[08/07 11:07:45 d2.utils.events]: eta: 1 day, 5:50:52 iter: 2999 total_loss: 0.811 loss_cls: 0.476 loss_box_reg: 0.226 loss_rpn_cls: 0.077 loss_rpn_loc: 0.019 time: 0.9198 data_time: 0.0164 lr: 0.004000 max_mem: 4173M
From your code, I do not see anything related to num-gpus. Maybe it is due to the lack of some extra code needed by Detectron2 if num-gpus changes?
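One thing worth checking when the GPU count changes: in Detectron2, `SOLVER.IMS_PER_BATCH` is the total batch size across all GPUs, so halving the number of GPUs doubles the images per GPU unless the total is lowered. The sketch below only does that arithmetic. The total batch size of 8 is an assumed value for illustration (the thread does not state it), and the linear learning-rate scaling is a common heuristic, not something the FewX authors are known to prescribe.

```python
def per_gpu_batch(ims_per_batch, num_gpus):
    """Images each GPU processes per iteration for a given total batch size."""
    if ims_per_batch % num_gpus != 0:
        raise ValueError("IMS_PER_BATCH must be divisible by the number of GPUs")
    return ims_per_batch // num_gpus


def scale_lr(base_lr, base_batch, new_batch):
    """Linear scaling heuristic: keep lr / total_batch_size constant."""
    return base_lr * new_batch / base_batch


ASSUMED_TOTAL_BATCH = 8   # hypothetical value, not taken from the FewX configs
LOGGED_LR = 0.004         # the lr printed in both training logs above

# Same total batch on fewer GPUs: more images per GPU, lr unchanged.
four_gpu = per_gpu_batch(ASSUMED_TOTAL_BATCH, 4)
two_gpu = per_gpu_batch(ASSUMED_TOTAL_BATCH, 2)

# If the total batch is halved instead (e.g. to fit memory), halve the lr too.
halved_lr = scale_lr(LOGGED_LR, ASSUMED_TOTAL_BATCH, ASSUMED_TOTAL_BATCH // 2)
```

Under these assumptions the 2-GPU run keeps the same learning rate but holds twice as many images per GPU, which is consistent with the higher `max_mem` in the 2-GPU log.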
Hello.
Where did you sample pairs for "Two-way Contrastive Training Strategy" in the code?
I didn't find the code which samples N : 2N : N ratio.
Thank you.
When I run "sh all.sh", I get this error:
RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1591914742272/work/torch/lib/c10d/ProcessGroupNCCL.cpp:32, unhandled cuda error, NCCL version 2.4.8
I am trying to train on 2 GPUs and changed the learning rate. I don't know where the problem occurs.
Hello, thanks for your attention.
I have downloaded R-50.pkl, but I don't know how to use it.
Could you please tell me how to solve this problem?
Thanks for your code! I hit this error when running it on the COCO dataset. Could you please tell me what this file is for and where I can get it? It seems that train_support_df.pkl and 10_shot_support_df.pkl organize the support data, so I'm confused by this error.
[08/18 14:15:37 d2.evaluation.evaluator]: Start inference on 5000 images
/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/setuptools/distutils_patch.py:26: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first.
"Distutils was imported before Setuptools. This usage is discouraged "
Traceback (most recent call last):
File "fsod_train_net.py", line 118, in <module>
args=(args,),
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/detectron2/engine/launch.py", line 62, in launch
main_func(*args)
File "fsod_train_net.py", line 101, in main
res = Trainer.test(cfg, model)
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 516, in test
results_i = inference_on_dataset(model, data_loader, evaluator)
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/detectron2/evaluation/evaluator.py", line 141, in inference_on_dataset
outputs = model(inputs)
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/ly/few-shot-object-detection/FewX-master/fewx/modeling/fsod/fsod_rcnn.py", line 126, in forward
self.init_model()
File "/home/ly/few-shot-object-detection/FewX-master/fewx/modeling/fsod/fsod_rcnn.py", line 302, in init_model
with open(support_file_name, 'wb') as f:
FileNotFoundError: [Errno 2] No such file or directory: './support_dir/support_feature.pkl'
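Because the file is opened with mode `'wb'`, Python would create the file itself if it were missing, so this `FileNotFoundError` indicates that the parent directory `./support_dir` does not exist. A minimal sketch of the usual fix is to create the directory before writing. The helper name `save_support_features` and the demo payload are hypothetical; this is not the FewX implementation.

```python
import os
import pickle
import tempfile


def save_support_features(features, path):
    """Pickle `features` to `path`, creating the parent directory if needed."""
    directory = os.path.dirname(path)
    if directory:
        # open(path, "wb") creates the file but never its parent directory.
        os.makedirs(directory, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(features, f)


# Demo in a temporary location so nothing is written into a real project.
demo_root = tempfile.mkdtemp()
demo_path = os.path.join(demo_root, "support_dir", "support_feature.pkl")
save_support_features({"example": [0.0, 1.0]}, demo_path)
```

In practice, running `mkdir support_dir` in the directory the script is launched from should have the same effect.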
Dear author, thanks for your great work. I am trying to run your code, but it always reports a NaN error. The traceback is below; could you have a look? Thanks in advance!
detectron2/detectron2/modeling/proposal_generator/proposal_utils.py", line 103, in find_top_rpn_proposals
raise FloatingPointError(
FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged.
Hello,
Given the provided model (model_final.pth), how would I predict on new images?
Hi~
I am trying to apply your FSOD model to a new dataset, but I ran into problems when registering the dataset. I have read some of the Detectron2 documentation and am still confused. Would you be so kind as to explain the dataset registration procedure?
Thanks a lot.
I followed step 2 to generate final_split_non_voc_instances_train2017.json. The resulting dataset is slightly different from the one in your training log. Did I miss something? Thanks.
| category | #instances | category | #instances | category | #instances |
|:-------------:|:-------------|:------------:|:-------------|:-------------:|:-------------|
| person | 0 | bicycle | 0 | car | 0 |
| motorcycle | 0 | airplane | 0 | bus | 0 |
| train | 0 | truck | 4483 | boat | 0 |
| traffic light | 1545 | fire hydrant | 1172 | stop sign | 1130 |
| parking meter | 511 | bench | 3977 | bird | 0 |
| cat | 0 | dog | 0 | horse | 0 |
| sheep | 0 | cow | 0 | elephant | 4007 |
| bear | 1241 | zebra | 4090 | giraffe | 4481 |
| backpack | 2632 | umbrella | 3634 | handbag | 3029 |
| tie | 2666 | suitcase | 3102 | frisbee | 1343 |
| skis | 2570 | snowboard | 1296 | sports ball | 848 |
| kite | 1327 | baseball bat | 996 | baseball gl.. | 749 |
| skateboard | 2922 | surfboard | 3377 | tennis racket | 1693 |
| bottle | 0 | wine glass | 2069 | cup | 6270 |
| fork | 2498 | knife | 2743 | spoon | 2037 |
| bowl | 5207 | banana | 3753 | apple | 2231 |
| sandwich | 2972 | orange | 2765 | broccoli | 4347 |
| carrot | 2964 | hot dog | 1829 | pizza | 3846 |
| donut | 4160 | cake | 3422 | chair | 0 |
| couch | 0 | potted plant | 0 | bed | 3005 |
| dining table | 0 | toilet | 3464 | tv | 0 |
| laptop | 2289 | mouse | 918 | remote | 1480 |
| keyboard | 1351 | cell phone | 2602 | microwave | 670 |
| oven | 1346 | toaster | 73 | sink | 2858 |
| refrigerator | 1271 | book | 3823 | clock | 2682 |
| vase | 2982 | scissors | 968 | teddy bear | 3386 |
| hair drier | 119 | toothbrush | 783 | | |
| total | 148004 | | | | |
Thanks for your release.
Would you mind specifying a detectron2 (d2) commit hash? d2 is still under active development, and breaking changes could affect reproducibility.
Thanks in advance.
Thanks for your inspiring work!
I have a question about the trained models you provided.
As I understand it, the two .pth files provided on Google Drive are trained on COCO: a base model and a fine-tuned one. Where can I find the model trained on the FSOD dataset (without fine-tuning)? I would appreciate any clarification! @fanq15
Thanks for your code! When I try to run it on my dataset, the error output is:
File "FewX-master/fewx/data/dataset_mapper.py", line 234, in generate_support
support_id= self.support_df.loc[
(self.support_df['category_id'] == query_cls) &
(~self.support_df['image_id'].isin(used_image_id))
& (~self.support_df['id'].isin(used_id_ls))
, 'id'].sample(random_state=id).tolist()[0]
1) Is there anything wrong with how I generated the pickle file?
If I change the code to:
support_id= self.support_df.loc[
(self.support_df['category_id'] == query_cls) , 'id'].sample(random_state=id).tolist()[0]
then this error happens:
detectron2/detectron2/modeling/proposal_generator/proposal_utils.py", line 95, in find_top_rpn_proposals
"Predicted boxes or scores contain Inf/NaN. Training has diverged."
FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged.
2) Could you give some suggestions?
This project cannot detect objects in my own pictures or my custom classes. Could you release the code? Meanwhile, it also cannot train on your FSOD data.
If not, would you please provide the model after base classes training but before fine-tuning? Thanks!
Hi.
I have run training and evaluation with the model. I'd like to check the predicted bounding boxes on the test images, so could you tell me how to do that?
I'd really appreciate it if you'd let me know.
Would you mind giving a clarification about the generation process of new_annotations/final_split_voc_10_shot_instances_train2017.json? Thanks in advance.
FewX/fewx/data/dataset_mapper.py
Line 242 in 3392c74
FewX/fewx/data/dataset_mapper.py
Line 215 in 3392c74
Hello, thank you for your work! I have a question: do the query images in your code come from COCO val2017?
Dear Qi Fan,
This is great work on the few-shot learning task! However, it seems that only FSOD for image object detection is implemented. Do you plan to release the FSVOD code for video object detection, and if so, when do you expect to? Thanks!
How to visualize the bounding boxes with scores?
Nice work. How would you extend the data preparation and the rest of the pipeline to a custom dataset, assuming the same N-way K-shot (N, K) setting as for COCO or VOC and the same number of classes? It might be good to add a section on this to the README.
When I try to download FSYTV-40-images from this link, https://drive.google.com/drive/folders/1a1PpfAxeYL7AbxYViDDnx7ACFtRohVL5, FSYTV-40-images shows up as a 149-byte file with no image data.
My graphics card is a GTX 1660 Ti with 6 GB of memory.
When I run this code, it reports an error:
RuntimeError: CUDA out of memory. Tried to allocate 1.00 GiB (GPU 0; 5.81 GiB total capacity; 2.90 GiB already allocated; 420.50 MiB free; 3.84 GiB reserved in total by PyTorch)
I want to use fewx to train my datasets. Could you release the code about how to generate "final_split_voc_10_shot_instances_train2017.json"?
After copying the project, how do I prepare the COCO dataset?
Three questions:
1) In 1_split_filter.py#L46-L48, as I understand it, a sampled image should not contain any objects from the VOC classes. However, this implementation seems to exclude only images with tiny objects.
2) In 2_balance.py#L57, does each category contain no more than 80 instances?
3) How is final_split_voc_10_shot_instances_train2017.json generated?
I use my own custom COCO-format dataset to train FSOD on 4 GPUs (GTX 2080 Ti). Training uses 2315 MB per GPU and works, but testing always runs out of CUDA memory, no matter how I adjust the test image size. Could you give me some advice?
How do I change the number of GPUs?
Thanks for your wonderful work! I'm wondering how I can get the pretrained model file 'model_final.pth' mentioned in your code. It is a 'pth' file, not a 'pkl'. I have downloaded 'model_final.pkl' from the URL in readme.md. Do I need to train and save the 'pth' file myself? Could you please share your best 'pth' file? Thanks again!
Yu
I don't understand line 33 of FewX-master.datasets.coco.6_voc_few_shot.py, which is used to generate the k-shot novel dataset file final_split_voc_10_shot_instances_train2017.json. Line 33 requires that every image selected for the k-shot novel dataset contains only one object. What is the purpose of this?
Both basic training and finetuning end with AssertionError thrown from this line. Does this mean that training ends correctly?
When I run all.sh, I get an error: TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
The full message is as follows:
Rank of current process: 0. World size: 2
Traceback (most recent call last):
File "fsod_train_net.py", line 118, in <module>
args=(args,),
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/engine/launch.py", line 59, in launch
daemon=False,
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:
-- Process 1 terminated with the following error:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/engine/launch.py", line 94, in _distributed_worker
main_func(*args)
File "/home/ubuntu/FewX/fsod_train_net.py", line 94, in main
cfg = setup(args)
File "/home/ubuntu/FewX/fsod_train_net.py", line 85, in setup
default_setup(cfg, args)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/engine/defaults.py", line 132, in default_setup
logger.info("Environment info:\n" + collect_env_info())
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/utils/collect_env.py", line 136, in collect_env_info
msg = " - invalid!" if not os.path.isdir(CUDA_HOME) else ""
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/genericpath.py", line 42, in isdir
st = os.stat(s)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
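The last frames show `os.path.isdir(CUDA_HOME)` being called while `CUDA_HOME` is `None`, which suggests that no CUDA toolkit location was found in this environment. A minimal sketch for checking this outside of Detectron2 follows. The lookup order in `find_cuda_home` is an assumption modelled on how the toolkit is commonly located (environment variable, then `nvcc` on `PATH`, then `/usr/local/cuda`); both function names are hypothetical.

```python
import os
import shutil


def find_cuda_home():
    """Best-effort toolkit lookup (assumed order); None if nothing is found."""
    home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH")
    if home:
        return home
    nvcc = shutil.which("nvcc")
    if nvcc:
        # nvcc normally lives in <cuda_home>/bin/nvcc
        return os.path.dirname(os.path.dirname(nvcc))
    default = "/usr/local/cuda"
    return default if os.path.isdir(default) else None


def describe_cuda_home(cuda_home):
    """Same check as in the traceback, but guarded against None."""
    if cuda_home is None:
        return "CUDA_HOME: not found"
    suffix = "" if os.path.isdir(cuda_home) else " - invalid!"
    return "CUDA_HOME: " + cuda_home + suffix


print(describe_cuda_home(find_cuda_home()))
```

If this prints "not found", exporting `CUDA_HOME` to point at the installed toolkit before running all.sh is a reasonable first thing to try.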
Hello, why does the config for the fine-tuning stage set K-shot to only 9?
The config is as follows:
_BASE_: "Base-FSOD-C4.yaml"
MODEL:
  WEIGHTS: "./output/fsod/R_50_C4_1x/model_final.pth"
  MASK_ON: False
  RESNETS:
    DEPTH: 50
  BACKBONE:
    FREEZE_AT: 5
DATASETS:
  TRAIN: ("coco_2017_train_voc_10_shot",)
  TEST: ("coco_2017_val",)
SOLVER:
  IMS_PER_BATCH: 4
  BASE_LR: 0.001
  STEPS: (2000, 3000)
  MAX_ITER: 3000
  WARMUP_ITERS: 200
INPUT:
  FS:
    FEW_SHOT: True
    SUPPORT_WAY: 2
    SUPPORT_SHOT: 9
  MIN_SIZE_TRAIN: (440, 472, 504, 536, 568, 600)
  MAX_SIZE_TRAIN: 1000
  MIN_SIZE_TEST: 600
  MAX_SIZE_TEST: 1000
OUTPUT_DIR: './output/fsod/finetune_dir/R_50_C4_1x'
When I run all.sh, CPU usage becomes much higher, close to 90%. I have no idea how to solve this. Could you give some advice?
Hello. I'm attempting to use a different dataset with the AttentionRPN. The evaluation script uses the pickled dataframe of the 10-shot novel classes, so when I evaluate, it detects only novel classes.
If I create a 10-shot support DF of the base classes and use that, I am able to detect the same. Is this the intended way of evaluating the base classes or have I done something incorrectly? I presume in the final evaluation of novel+base, another support DF needs to be created?
Also - there are places where the no. of ways are fixed to 2 and this is asserted. Is it safe to remove those assertions?
Hi, I trained twice on different machines and got an accuracy of 11.45% once and 11.40% the other time. Is the training result unstable? How can I reach 12% AP?
fsod_train_log.txt
fsod_finetune_test_log.txt
fsod_finetune_train_log.txt
Hello, thank you for your work! I have a question. I carefully read your implementation of FSOD and cannot find the Attention RPN operation in the model. According to the log, the model looks like a Faster R-CNN model. I cannot find the global average pooling on the support image or the convolution over the query image. How do you implement this operation in the code? Thank you!
Thanks for your code!
I want to train on my datasets.
However, I do not know how final_split_voc_10_shot_instances_train2017.json is made.
Could you share the code with me?
Thanks a lot again!
When I add a new class to my training set and run sh all.sh, I get an error:
query_cls = self.support_df.loc[self.support_df['id']==id, 'category_id'].tolist()[0] # they share the same category_id and image_id
IndexError: list index out of range
I checked my dataset_dict and support_df and didn't find anything wrong, so I am confused. Can you give me some advice?
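The `IndexError` comes from `.tolist()[0]` on an empty selection, meaning some annotation `id` requested during training has no matching row in `support_df`. A minimal sketch for locating such ids with plain Python is shown here. The function name `find_missing_support_ids` and the sample ids are hypothetical; in practice the two lists would come from the training annotation JSON and from the `id` column of the support dataframe.

```python
def find_missing_support_ids(annotation_ids, support_ids):
    """Return annotation ids that have no matching row in the support table."""
    support = set(support_ids)
    return sorted(i for i in set(annotation_ids) if i not in support)


# Illustrative ids only: annotation 3 belongs to the newly added class
# but was never written into the support table.
training_annotation_ids = [1, 2, 3, 2]
support_table_ids = [1, 2]

missing = find_missing_support_ids(training_annotation_ids, support_table_ids)
```

A non-empty result would suggest that the support pickle was generated before the new class was added and needs to be regenerated.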
Thanks for sharing your implementation! Would you also share your code for training on your FSOD dataset? I would like to use your FSOD dataset for my own experiments, but don't know how to get started. I have downloaded it but I am unsure which images I should use as query images. I also don't understand the intention behind the directory structure.
In your paper you wrote that you used a 2-way-5-shot evaluation protocol. How are the 200 test categories split in pairs of two?