fewx's People

Contributors

fanq15, ze-yang

fewx's Issues

__init__() got an unexpected keyword argument 'first_stride'

Traceback (most recent call last):
File "mytest.py", line 253, in
model = init()
File "mytest.py", line 196, in init
predictor = DefaultPredictor(cfg)
File "/usr/local/lib/python3.7/dist-packages/detectron2/engine/defaults.py", line 216, in init
self.model = build_model(self.cfg)
File "/usr/local/lib/python3.7/dist-packages/detectron2/modeling/meta_arch/build.py", line 21, in build_model
model = META_ARCH_REGISTRY.get(meta_arch)(cfg)
File "mytest.py", line 43, in init
self.roi_heads = build_roi_heads(cfg, self.backbone.output_shape())
File "/content/drive/My Drive/FewX/fewx/modeling/fsod/fsod_roi_heads.py", line 44, in build_roi_heads
return ROI_HEADS_REGISTRY.get(name)(cfg, input_shape)
File "/content/drive/My Drive/FewX/fewx/modeling/fsod/fsod_roi_heads.py", line 75, in init
self.res5, out_channels = self._build_res5_block(cfg)
File "/content/drive/My Drive/FewX/fewx/modeling/fsod/fsod_roi_heads.py", line 102, in _build_res5_block
stride_in_1x1=stride_in_1x1,
File "/usr/local/lib/python3.7/dist-packages/detectron2/modeling/backbone/resnet.py", line 609, in make_stage
return ResNet.make_stage(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/detectron2/modeling/backbone/resnet.py", line 541, in make_stage
block_class(in_channels=in_channels, out_channels=out_channels, **curr_kwargs)
TypeError: __init__() got an unexpected keyword argument 'first_stride'

How can I solve this problem?
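
A workaround that has been reported for newer detectron2 releases (a hedged sketch, not the authors' fix): `ResNet.make_stage` dropped the `first_stride` argument in favor of per-block strides, so the call in `_build_res5_block` can be rewritten along these lines. The channel values below mirror detectron2's standard `Res5ROIHeads` and may need adjusting to FewX's copy of that code.

```python
# Hedged sketch: replacing the deprecated first_stride argument with
# stride_per_block when building a res5 stage (newer detectron2 API).
from detectron2.modeling.backbone.resnet import BottleneckBlock, ResNet

# Example channel sizes matching a ResNet-50 res5 stage; inside
# _build_res5_block these values come from the config.
out_channels = 2048
bottleneck_channels = 512

res5_blocks = ResNet.make_stage(
    BottleneckBlock,
    3,
    stride_per_block=[2, 1, 1],     # was: first_stride=2
    in_channels=out_channels // 2,
    bottleneck_channels=bottleneck_channels,
    out_channels=out_channels,
    num_groups=1,
    norm="FrozenBN",
    stride_in_1x1=True,
)
```

Alternatively, pinning detectron2 to the release the repository was developed against avoids the API change altogether.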

A question about data

Thanks for your good work!

I want to use my own data. Should I convert it to the COCO format?
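
Converting to COCO format is the path of least resistance with a detectron2-based codebase. A minimal COCO-style annotation skeleton looks like the sketch below (standard COCO field names; the ids, paths, and class name are placeholders):

```python
# Minimal COCO-format skeleton for a custom dataset (illustrative values only).
import json

coco = {
    "images": [
        {"id": 1, "file_name": "000001.jpg", "height": 480, "width": 640},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 120, 50, 80],   # [x, y, width, height]
         "area": 50 * 80, "iscrowd": 0},
    ],
    "categories": [
        {"id": 1, "name": "my_class"},
    ],
}

with open("my_dataset_train.json", "w") as f:
    json.dump(coco, f)
```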

How to get 12 mAP?

I ran all.sh and got nothing, so I used the Detectron2 demo prediction code to generate the COCO-format results, but the result is far too low:

loading annotations into memory...
Done (t=0.50s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.38s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=11.88s).
Accumulating evaluation results...
DONE (t=2.87s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.011
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.020
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.010
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.006
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.023
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.027
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.032
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.032
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.010
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.071

Issues with num_gpus

With 4 GPUs, I reran the default settings in all.sh and the AP is correct (11.27 on the 20 VOC categories).

However, when I try to use another machine equipped with 2 GPUs, the loss_cls becomes strange and the AP at the end of training is near 0.

Hereby I provide my training log for debugging.
fsod_train_log.txt

Comparing the logs from the 2-GPU machine and the 4-GPU machine, the loss_cls already diverges before iteration 2999, as can be seen below:

2 GPUs

[08/13 11:32:08 d2.utils.events]: eta: 2 days, 10:28:56  iter: 2999  total_loss: 0.935  loss_cls: 0.566  loss_box_reg: 0.255  loss_rpn_cls: 0.061  loss_rpn_loc: 0.015  time: 1.8013  data_time: 0.0312  lr: 0.004000  max_mem: 7442M

4 GPUs

[08/07 11:07:45 d2.utils.events]: eta: 1 day, 5:50:52  iter: 2999  total_loss: 0.811  loss_cls: 0.476  loss_box_reg: 0.226  loss_rpn_cls: 0.077  loss_rpn_loc: 0.019  time: 0.9198  data_time: 0.0164  lr: 0.004000  max_mem: 4173M

From your code, I do not see anything related to num-gpus. Maybe some extra handling is needed by Detectron2 when num-gpus changes?
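
A hedged suggestion rather than a diagnosis: detectron2's SOLVER.IMS_PER_BATCH is the total batch size summed over all GPUs, so it is worth confirming that both machines actually run the same effective batch size. If they differ, a common rule of thumb is to rescale the learning rate and schedule linearly, roughly as sketched below.

```python
# Hedged sketch (not from the FewX authors): linear scaling of the solver
# schedule when the total batch size changes, e.g. after moving to a machine
# with a different GPU count.
def scale_solver(cfg, reference_batch: int):
    """Rescale LR and schedule relative to the batch size the config assumed."""
    scale = cfg.SOLVER.IMS_PER_BATCH / reference_batch
    cfg.SOLVER.BASE_LR *= scale
    cfg.SOLVER.MAX_ITER = int(cfg.SOLVER.MAX_ITER / scale)
    cfg.SOLVER.STEPS = tuple(int(s / scale) for s in cfg.SOLVER.STEPS)
    cfg.SOLVER.WARMUP_ITERS = int(cfg.SOLVER.WARMUP_ITERS / scale)
    return cfg
```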

RuntimeError

when I run "sh all,sh",there was an error:
RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1591914742272/work/torch/lib/c10d/ProcessGroupNCCL.cpp:32, unhandled cuda error, NCCL version 2.4.8

I tried to train on 2 GPUs and changed the learning rate, but I don't know where the problem occurred.

No such file or directory: './support_dir/support_feature.pkl'

Thanks for your code! But I met this error when I ran your code on the COCO dataset. Could you please tell me what this file is for, and where I can get it? It seems that the files train_support_df.pkl and 10_shot_support_df.pkl organize the support data, so I'm confused about this error.

[08/18 14:15:37 d2.evaluation.evaluator]: Start inference on 5000 images
/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/setuptools/distutils_patch.py:26: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first.
"Distutils was imported before Setuptools. This usage is discouraged "
Traceback (most recent call last):
File "fsod_train_net.py", line 118, in
args=(args,),
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/detectron2/engine/launch.py", line 62, in launch
main_func(*args)
File "fsod_train_net.py", line 101, in main
res = Trainer.test(cfg, model)
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 516, in test
results_i = inference_on_dataset(model, data_loader, evaluator)
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/detectron2/evaluation/evaluator.py", line 141, in inference_on_dataset
outputs = model(inputs)
File "/home/ly/anaconda3/envs/py3_torch151/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/home/ly/few-shot-object-detection/FewX-master/fewx/modeling/fsod/fsod_rcnn.py", line 126, in forward
self.init_model()
File "/home/ly/few-shot-object-detection/FewX-master/fewx/modeling/fsod/fsod_rcnn.py", line 302, in init_model
with open(support_file_name, 'wb') as f:

FileNotFoundError: [Errno 2] No such file or directory: './support_dir/support_feature.pkl'
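
The traceback fails while *writing* support_feature.pkl (the `open(..., 'wb')` call), so the likely cause is simply that the `./support_dir` directory does not exist in the working directory. A hedged sketch of the minimal fix:

```python
# Hedged sketch: create the missing directory before running evaluation,
# so fsod_rcnn.init_model() can write ./support_dir/support_feature.pkl.
import os

os.makedirs("./support_dir", exist_ok=True)
```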

NaN

Dear author, thanks for your great work. Currently I am trying to run your code, but it always reports a NaN error. The error traceback is below; could you have a look? Thanks in advance!

detectron2/detectron2/modeling/proposal_generator/proposal_utils.py", line 103, in find_top_rpn_proposals
    raise FloatingPointError(
FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged.
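
For the record, and without claiming this is the authors' intended setup, common mitigations in detectron2 when training diverges are lowering the base learning rate or enabling gradient clipping through the standard solver options:

```python
# Hedged sketch: standard detectron2 solver options that often help when
# losses hit Inf/NaN. The values are illustrative, not tuned for FewX.
from detectron2.config import get_cfg

cfg = get_cfg()   # in practice, the FewX training config built in fsod_train_net.py
cfg.SOLVER.BASE_LR *= 0.5                       # try a smaller learning rate
cfg.SOLVER.CLIP_GRADIENTS.ENABLED = True        # clip gradients by value
cfg.SOLVER.CLIP_GRADIENTS.CLIP_TYPE = "value"
cfg.SOLVER.CLIP_GRADIENTS.CLIP_VALUE = 1.0
```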

Predict on new images

Hello,

Given the provided model (model_final.pth), how would I predict on new images?
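
The repository does not appear to ship a demo script, so below is a rough single-image inference sketch along standard detectron2 lines. The `fewx.config` import, the config path, and the weight filename are assumptions and may need adjusting; also, as the support_feature.pkl issue above shows, FsodRCNN builds or loads support features in `init_model()` on its first eval-mode forward pass, so the support data referenced by the config must be in place.

```python
# Hedged sketch: single-image inference with the provided FewX weights.
# Assumptions: fewx.config exposes a get_cfg() that knows the FS options,
# importing fsod_rcnn registers the FsodRCNN meta-architecture, and the
# fine-tuning config lives at configs/fsod/finetune_R_50_C4_1x.yaml.
import cv2
import torch
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.modeling import build_model
from fewx.config import get_cfg                     # assumed helper
import fewx.modeling.fsod.fsod_rcnn  # noqa: F401   # registers the meta-architecture

cfg = get_cfg()
cfg.merge_from_file("configs/fsod/finetune_R_50_C4_1x.yaml")   # assumed path
cfg.MODEL.WEIGHTS = "model_final.pth"

model = build_model(cfg)
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)
model.eval()

img = cv2.imread("my_image.jpg")                    # BGR, as detectron2 expects
height, width = img.shape[:2]
image_tensor = torch.as_tensor(img.astype("float32").transpose(2, 0, 1))
inputs = {"image": image_tensor, "height": height, "width": width}

with torch.no_grad():
    outputs = model([inputs])[0]
print(outputs["instances"].to("cpu"))
```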

Apply model to new dataset

Hi~

I was trying to apply your FSOD model to a new dataset, but encountered some problems when registering the dataset. I have read some Detectron2 documentation but am still confused. Would you be so kind as to explain the dataset registration procedure?

Thanks a lot.
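
For a COCO-format dataset, detectron2's built-in helper is usually enough; the dataset names and paths below are placeholders, and the FewX configs would then point DATASETS.TRAIN / DATASETS.TEST at the registered names.

```python
# Hedged sketch: registering a custom COCO-format dataset with detectron2.
from detectron2.data.datasets import register_coco_instances

register_coco_instances("my_dataset_train", {},
                        "datasets/my_dataset/annotations/train.json",
                        "datasets/my_dataset/images/train")
register_coco_instances("my_dataset_val", {},
                        "datasets/my_dataset/annotations/val.json",
                        "datasets/my_dataset/images/val")
```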

Difference between final_split_non_voc_instances_train2017.json and your log file

I followed step 2 to generate final_split_non_voc_instances_train2017.json. The resulting dataset is slightly different from your training log. I am wondering whether I missed something? Thanks.

| category      | #instances | category     | #instances | category      | #instances |
|:-------------:|:-----------|:------------:|:-----------|:-------------:|:-----------|
| person        | 0          | bicycle      | 0          | car           | 0          |
| motorcycle    | 0          | airplane     | 0          | bus           | 0          |
| train         | 0          | truck        | 4483       | boat          | 0          |
| traffic light | 1545       | fire hydrant | 1172       | stop sign     | 1130       |
| parking meter | 511        | bench        | 3977       | bird          | 0          |
| cat           | 0          | dog          | 0          | horse         | 0          |
| sheep         | 0          | cow          | 0          | elephant      | 4007       |
| bear          | 1241       | zebra        | 4090       | giraffe       | 4481       |
| backpack      | 2632       | umbrella     | 3634       | handbag       | 3029       |
| tie           | 2666       | suitcase     | 3102       | frisbee       | 1343       |
| skis          | 2570       | snowboard    | 1296       | sports ball   | 848        |
| kite          | 1327       | baseball bat | 996        | baseball gl.. | 749        |
| skateboard    | 2922       | surfboard    | 3377       | tennis racket | 1693       |
| bottle        | 0          | wine glass   | 2069       | cup           | 6270       |
| fork          | 2498       | knife        | 2743       | spoon         | 2037       |
| bowl          | 5207       | banana       | 3753       | apple         | 2231       |
| sandwich      | 2972       | orange       | 2765       | broccoli      | 4347       |
| carrot        | 2964       | hot dog      | 1829       | pizza         | 3846       |
| donut         | 4160       | cake         | 3422       | chair         | 0          |
| couch         | 0          | potted plant | 0          | bed           | 3005       |
| dining table  | 0          | toilet       | 3464       | tv            | 0          |
| laptop        | 2289       | mouse        | 918        | remote        | 1480       |
| keyboard      | 1351       | cell phone   | 2602       | microwave     | 670        |
| oven          | 1346       | toaster      | 73         | sink          | 2858       |
| refrigerator  | 1271       | book         | 3823       | clock         | 2682       |
| vase          | 2982       | scissors     | 968        | teddy bear    | 3386       |
| hair drier    | 119        | toothbrush   | 783        |               |            |
| total         | 148004     |              |            |               |            |

About detectron2 dependency

Thanks for your release.
Would you mind specifying a detectron2 (d2) commit number? d2 keeps developing, and breaking changes may be made that could affect reproducibility.
Thanks in advance.

About trained model

Thanks for your inspiring work!
I have a question about the trained models you provided.
Based on my understanding, the two provided .pth files on Google Drive are trained on COCO: a base one and a fine-tuned one. But where can I find the one trained on the FSOD dataset (without fine-tuning)? I'd appreciate any clarification! @fanq15


Training ends with NaN loss

Thanks for your code! When I try to run it on my dataset, the error output is:
File "FewX-master/fewx/data/dataset_mapper.py", line 234, in generate_support
support_id= self.support_df.loc[
(self.support_df['category_id'] == query_cls) &
(~self.support_df['image_id'].isin(used_image_id))
& (~self.support_df['id'].isin(used_id_ls))
, 'id'].sample(random_state=id).tolist()[0]

1) Is there anything wrong with generating the pickle file?
If I change the code to:
support_id = self.support_df.loc[(self.support_df['category_id'] == query_cls), 'id'].sample(random_state=id).tolist()[0]
this error happens:
detectron2/detectron2/modeling/proposal_generator/proposal_utils.py", line 95, in find_top_rpn_proposals
"Predicted boxes or scores contain Inf/NaN. Training has diverged."
FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged.
2) Could you give some suggestions?
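
One likely cause (a hedged guess, not confirmed by the authors) is that some category in the custom support dataframe has too few distinct support instances or images, so the chained filters in generate_support leave an empty frame and `.tolist()[0]` fails. A quick sanity check:

```python
# Hedged sketch: count distinct support instances and images per category in
# the support dataframe. The pickle path is an assumption; use wherever your
# generation script wrote the dataframe.
import pandas as pd

support_df = pd.read_pickle("./datasets/coco/10_shot_support_df.pkl")  # assumed path
summary = support_df.groupby("category_id").agg(
    n_instances=("id", "nunique"),
    n_images=("image_id", "nunique"),
)
print(summary.sort_values("n_instances").head(20))  # weakest categories first
```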

How to detect my own pictures?

This project cannot detect my own pictures or my custom classes; could you release the code for that? Meanwhile, I cannot train on your FSOD data.

How to test?

Hi.
Training and evaluation were conducted with the model. I'd like to check the bounding boxes on the images at test time, so could you tell me what to do?

I'd really appreciate it if you'd let me know.

Why does the mapper need to set a random seed when selecting support images?

support_id = self.support_df.loc[(self.support_df['category_id'] == other_cls) & (~self.support_df['image_id'].isin(used_image_id)) & (~self.support_df['id'].isin(used_id_ls)), 'id'].sample(random_state=id).tolist()[0]

support_id = self.support_df.loc[(self.support_df['category_id'] == query_cls) & (~self.support_df['image_id'].isin(used_image_id)) & (~self.support_df['id'].isin(used_id_ls)), 'id'].sample(random_state=id).tolist()[0]

Query images

Hello, thank you for your work! I have a question: do the query images in your code come from COCO val2017?

FSVOD code

Dear Qi Fan,

This is great work on the few-shot learning task! It seems that only FSOD for image object detection is implemented. Do you have a plan to release the FSVOD code for video object detection, and when is it expected? Thanks!

bounding box

How to visualize the bounding boxes with scores?
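
A hedged sketch of drawing predicted boxes with scores using detectron2's Visualizer; `outputs` is assumed to be the prediction dict returned by the model for the same image:

```python
# Hedged sketch: visualize predicted boxes and scores with detectron2.
# `outputs` is the model's prediction dict for `img` (BGR, as read by OpenCV);
# metadata supplies class names for a registered dataset.
import cv2
from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import Visualizer

img = cv2.imread("my_image.jpg")
metadata = MetadataCatalog.get("coco_2017_val")        # or your own dataset name
v = Visualizer(img[:, :, ::-1], metadata=metadata, scale=1.0)
vis = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imwrite("prediction.jpg", vis.get_image()[:, :, ::-1])
```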

Custom Dataset

Nice work. How would you extend the data prep and the rest of the pipeline for a custom dataset, assuming the same N-way K-shot (N, K) setup as COCO or VOC and the same number of classes? It might be good to add a section on this to the README.

How much memory is needed for inference?

My graphics card is a GTX 1660 Ti with 6 GB of memory.
Running this code reports an error:
RuntimeError: CUDA out of memory. Tried to allocate 1.00 GiB (GPU 0; 5.81 GiB total capacity; 2.90 GiB already allocated; 420.50 MiB free; 3.84 GiB reserved in total by PyTorch)
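
Not an official recommendation, but if 6 GB is too little at the default test resolution, reducing the test-time image size through the standard detectron2 input options may help (the provided configs use MIN_SIZE_TEST: 600 / MAX_SIZE_TEST: 1000):

```python
# Hedged sketch: lower the test-time image size to reduce inference memory.
# `cfg` stands for the evaluation config; values here are illustrative.
cfg.INPUT.MIN_SIZE_TEST = 400
cfg.INPUT.MAX_SIZE_TEST = 667
```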

COCO dataset

After cloning the project, how do I prepare the COCO dataset?

Dataset Generation

Three questions:

  1. In 1_split_filter.py#L46-L48, as I understand it, sampled images should not contain objects of the VOC classes. However, this implementation seems to exclude only images with tiny objects;

  2. In 2_balance.py#L57, does each category contain no more than 80 instances?

  3. How to generate final_split_voc_10_shot_instances_train2017.json ?

CUDA out of memory

I use my own custom COCO dataset to train FSOD on 4 GPUs (2080 Ti). Training costs 2315 MB per GPU and works, but testing always runs out of CUDA memory, whatever I do to adjust the test image size. Would you give me some advice?

Where can I find the model_final.pth?

Thanks for your wonderful work! I'm wondering how I can get the pretrained model parameter dictionary 'model_final.pth' mentioned in your code. It's a '.pth', not a '.pkl'. I have downloaded 'model_final.pkl' from the URL in the README. Do I need to train and save the '.pth' file myself? Would you please share the best '.pth' file? Thanks again!

Yu

About final_split_voc_10_shot_instances_train2017 generation

I don't understand line 33 of FewX-master/datasets/coco/6_voc_few_shot.py, which is used to generate the k-shot novel dataset file final_split_voc_10_shot_instances_train2017.json. Line 33 requires that the images selected into the k-shot novel dataset contain only one object. For what purpose?

TypeError

When I run all.sh, there is an error: TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

The whole message is as follows:

Rank of current process: 0. World size: 2
Traceback (most recent call last):
File "fsod_train_net.py", line 118, in
args=(args,),
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/engine/launch.py", line 59, in launch
daemon=False,
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/engine/launch.py", line 94, in _distributed_worker
main_func(*args)
File "/home/ubuntu/FewX/fsod_train_net.py", line 94, in main
cfg = setup(args)
File "/home/ubuntu/FewX/fsod_train_net.py", line 85, in setup
default_setup(cfg, args)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/engine/defaults.py", line 132, in default_setup
logger.info("Environment info:\n" + collect_env_info())
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/site-packages/detectron2/utils/collect_env.py", line 136, in collect_env_info
msg = " - invalid!" if not os.path.isdir(CUDA_HOME) else ""
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.6/genericpath.py", line 42, in isdir
st = os.stat(s)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
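
The crash happens while detectron2 collects environment info and `CUDA_HOME` is `None`, i.e. PyTorch cannot locate the CUDA toolkit on that machine. A quick check (hedged sketch):

```python
# Hedged sketch: verify that PyTorch can locate the CUDA toolkit. If this
# prints None, point CUDA_HOME at your toolkit (e.g. /usr/local/cuda) in the
# shell environment before launching all.sh.
from torch.utils.cpp_extension import CUDA_HOME

print(CUDA_HOME)
```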

Why does the config in the fine-tuning stage set the K-shot to only 9?

Hello, why does the config in the fine-tuning stage set the K-shot to only 9-shot?

The config is as below:

BASE: "Base-FSOD-C4.yaml"
MODEL:
WEIGHTS: "./output/fsod/R_50_C4_1x/model_final.pth"
MASK_ON: False
RESNETS:
DEPTH: 50
BACKBONE:
FREEZE_AT: 5
DATASETS:
TRAIN: ("coco_2017_train_voc_10_shot",)
TEST: ("coco_2017_val",)
SOLVER:
IMS_PER_BATCH: 4
BASE_LR: 0.001
STEPS: (2000, 3000)
MAX_ITER: 3000
WARMUP_ITERS: 200
INPUT:
FS:
FEW_SHOT: True
SUPPORT_WAY: 2
SUPPORT_SHOT: 9
MIN_SIZE_TRAIN: (440, 472, 504, 536, 568, 600)
MAX_SIZE_TRAIN: 1000
MIN_SIZE_TEST: 600
MAX_SIZE_TEST: 1000
OUTPUT_DIR: './output/fsod/finetune_dir/R_50_C4_1x'

Occupancy rate of CPU

When I run all.sh, the CPU occupancy rate becomes much higher, even close to 90%. I have no idea how to solve this problem; could you give some advice?
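
High CPU usage typically comes from the dataloader worker processes and the OpenMP/BLAS thread pools. A hedged suggestion, not the authors' recommendation: reduce the number of dataloader workers (and optionally export OMP_NUM_THREADS=2 in the shell before launching all.sh).

```python
# Hedged sketch: fewer dataloader workers mean less CPU load, at the cost of
# data-loading throughput. `cfg` stands for the training config.
cfg.DATALOADER.NUM_WORKERS = 2   # detectron2's default is 4
```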

Support set for base classes and Limit on no. of ways

Hello. I'm attempting to use a different dataset with the AttentionRPN. The evaluation script uses the pickled dataframe of the 10-shot novel classes, so when I evaluate, it detects only novel classes.

If I create a 10-shot support DF of the base classes and use that, I am able to detect the same. Is this the intended way of evaluating the base classes or have I done something incorrectly? I presume in the final evaluation of novel+base, another support DF needs to be created?

Also, there are places where the number of ways is fixed to 2 and this is asserted. Is it safe to remove those assertions?

cannot find Attention RPN operation

Hello, thank you for your work! I have a question. I carefully read your implementation of FSOD, and I cannot find the Attention RPN operation in the model. The log makes this model look like a plain Faster R-CNN model. I cannot find the global average pooling on the support image or the convolution operation on the query image. How do you implement this operation in this code? Thank you!

final_split_voc_10_shot_instances_train2017.json

Thanks for your code!
I want to train on my own datasets.
However, I am not aware of how final_split_voc_10_shot_instances_train2017.json is made.
Could you share the code with me?
Thanks a lot again!

IndexError: list index out of range

When I add a new class to my training set and run sh all.sh, there is an error:
query_cls = self.support_df.loc[self.support_df['id']==id, 'category_id'].tolist()[0] # they share the same category_id and image_id
IndexError: list index out of range
I checked my dataset_dict and support_df and didn't find anything wrong, so I am confused. Can you give me some advice?
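
The failing line looks up the annotation `id` in the support dataframe, so a likely cause (hedged, not confirmed by the authors) is that annotations for the new class exist in the training JSON but were never added to the support dataframe. A quick consistency check:

```python
# Hedged sketch: verify that every annotation id in the training JSON also
# appears in the support dataframe. Paths are assumptions; use your own files.
import json
import pandas as pd

support_df = pd.read_pickle("./datasets/coco/train_support_df.pkl")    # assumed path
with open("./datasets/coco/my_train_annotations.json") as f:           # assumed path
    coco = json.load(f)

ann_ids = {a["id"] for a in coco["annotations"]}
missing = ann_ids - set(support_df["id"])
print(f"{len(missing)} annotation ids are missing from the support dataframe")
```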

Training on FSOD dataset

Thanks for sharing your implementation! Would you also share your code for training on your FSOD dataset? I would like to use your FSOD dataset for my own experiments, but I don't know how to get started. I have downloaded it, but I am unsure which images I should use as query images. I also don't understand the intention behind the directory structure.
In your paper you wrote that you used a 2-way 5-shot evaluation protocol. How are the 200 test categories split into pairs of two?
