
fastinst's People

Contributors

junjiehe96


fastinst's Issues

GPU computational resources quantity

Dear Authors,
Thank you for your very interesting work and source code.

Could you please confirm the number of GPUs used for training? Was it 1x V100 or 4x V100?
The paper indicates that 1x A100 GPU is used for evaluation and inference, but the script in the source code is preset to 4 GPUs.

Many thanks in advance.

demo.py visualization: the threshold has no effect and masks overlap

./demo/demo.py
--config-file .n/configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml
--input /hy-tmp/datasets/coco/val2017/000000001818.jpg
--output ./可视化结果
--confidence-threshold 0.5
--opts MODEL.WEIGHTS ./fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth
000000001818

Predicting an image

How can I use a trained model to run prediction on a single image?

About FPS

Hi, I tried to reproduce the FastInst results on COCO, mainly the FPS (using FastInst-Res50-D3 in --eval-only mode with --num-gpus 1).

  1. On a server with two 3090 GPUs and on a server with two A6000 GPUs, both runs reported a pure inference time of ~0.022 s (about 45 FPS),
    which is a much higher FPS than the one reported in the paper and on the main page of the FastInst GitHub repository. Is there any
    post-processing step that is not counted in the 'pure inference time'? If so, could you please explain how to measure the full
    inference time? Or is the difference simply caused by the GPU/CPU/CUDA/PyTorch/... setup?
    Looking forward to your response. Thank you.

Cannot reproduce your frame rate on a V100

Originally posted by @junjiehe96 in #37 (comment)

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-PCIE-32GB            On | 00000000:07:00.0 Off |                    0 |
| N/A   31C    P0              37W / 250W |   5146MiB / 32768MiB |      4%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

[04/06 17:46:17 d2.evaluation.evaluator]: Inference done 370/5000. Dataloading: 0.0022 s/iter. Inference: 0.0564 s/iter. Eval: 0.1694 s/iter. Total: 0.2281 s/iter. ETA=0:17:36
[04/06 17:46:22 d2.evaluation.evaluator]: Inference done 391/5000. Dataloading: 0.0022 s/iter. Inference: 0.0567 s/iter. Eval: 0.1699 s/iter. Total: 0.2289 s/iter. ETA=0:17:35
[04/06 17:46:27 d2.evaluation.evaluator]: Inference done 412/5000. Dataloading: 0.0022 s/iter. Inference: 0.0568 s/iter. Eval: 0.1704 s/iter. Total: 0.2296 s/iter. ETA=0:17:33
[04/06 17:46:32 d2.evaluation.evaluator]: Inference done 433/5000. Dataloading: 0.0023 s/iter. Inference: 0.0567 s/iter. Eval: 0.1712 s/iter. Total: 0.2303 s/iter. ETA=0:17:31
[04/06 17:46:37 d2.evaluation.evaluator]: Inference done 457/5000. Dataloading: 0.0022 s/iter. Inference: 0.0565 s/iter. Eval: 0.1706 s/iter. Total: 0.2295 s/iter. ETA=0:17:22
[04/06 17:46:42 d2.evaluation.evaluator]: Inference done 480/5000. Dataloading: 0.0022 s/iter. Inference: 0.0564 s/iter. Eval: 0.1704 s/iter. Total: 0.2292 s/iter. ETA=0:17:16

Why can't I reproduce your frame rate on a V100? Using exactly the default settings of fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml, I get fewer than 20 FPS.
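
As a rough aid for reading logs like the one above, here is a minimal sketch that splits one evaluator log line into its per-stage timings and converts them to frames per second. The regular expression and the helper name `timing_breakdown` are illustrative assumptions based only on the log format shown in this issue; whether the paper's FPS counts only the "Inference" stage is not confirmed here.

```python
import re

# Pattern for the per-stage timings as they appear in the log lines above.
PATTERN = re.compile(
    r"Dataloading: (?P<data>[\d.]+) s/iter\. "
    r"Inference: (?P<infer>[\d.]+) s/iter\. "
    r"Eval: (?P<eval>[\d.]+) s/iter\. "
    r"Total: (?P<total>[\d.]+) s/iter\."
)


def timing_breakdown(log_line):
    """Return seconds per iteration for each stage, or None if nothing matches."""
    match = PATTERN.search(log_line)
    if match is None:
        return None
    return {name: float(value) for name, value in match.groupdict().items()}


line = (
    "[04/06 17:46:42 d2.evaluation.evaluator]: Inference done 480/5000. "
    "Dataloading: 0.0022 s/iter. Inference: 0.0564 s/iter. "
    "Eval: 0.1704 s/iter. Total: 0.2292 s/iter. ETA=0:17:16"
)
timings = timing_breakdown(line)
fps_inference = 1.0 / timings["infer"]  # model forward pass only
fps_total = 1.0 / timings["total"]      # includes data loading and evaluation
print(f"inference-only: {fps_inference:.1f} FPS, end-to-end: {fps_total:.1f} FPS")
```

For this log line the inference-only figure is about 17.7 FPS, while the end-to-end figure is much lower because the "Eval" stage dominates the total time.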

How did you obtain the test results on the COCO 2017 test-dev dataset? Why is the JSON file generated by evaluating on COCO 2017 test-dev so large, and why does uploading it to the official COCO evaluation server fail?

Hello, this is great work!
I am reproducing the results of your paper and ran the following command:
python train_net.py --eval-only --num-gpus 2 --config-file configs/coco/instance-segmentation/fastinst_R50_ppm-fpn_x1_576.yaml MODEL.WEIGHTS /path/to/checkpoint_file
The weight file was downloaded from here. I changed the test dataset to COCO 2017 test-dev and downloaded the corresponding annotation JSON file. After evaluation finished, the corresponding JSON file was generated under the output path, but it is very large, about 1.1 GB; with test instead of test-dev it is 2.6 GB. I then renamed the file as required by the official COCO evaluation guidelines and uploaded the JSON file to the server, but none of the uploads succeeded.
Error from the first upload:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
/opt/conda/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
Traceback (most recent call last):
File "/tmp/codalab/tmpa2BTU6/run/program/run.py", line 112, in <module>
res.extend(json.load(data_file))
File "/opt/conda/lib/python2.7/json/__init__.py", line 291, in load
**kw)
File "/opt/conda/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/opt/conda/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/opt/conda/lib/python2.7/json/decoder.py", line 380, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 1 column 48971776 (char 48971775)
Error from the second upload:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
/opt/conda/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
Traceback (most recent call last):
File "/tmp/codalab/tmpC6m7hj/run/program/run.py", line 120, in <module>
cocoDt=cocoGt.loadRes(resFile)
File "/opt/conda/lib/python2.7/site-packages/pycocotools/coco.py", line 309, in loadRes
anns = json.load(open(resFile))
File "/opt/conda/lib/python2.7/json/__init__.py", line 287, in load
return loads(fp.read(),
MemoryError
Error from the third upload:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
/opt/conda/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
Any guidance would be appreciated, thank you!
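
One thing that may be worth trying for the MemoryError is shrinking the results file before uploading. The following is a minimal sketch, not the authors' procedure: it drops low-score entries, caps the number of entries per image, and serializes without extra whitespace. The function name `trim_results` and both default values are hypothetical, and discarding low-score detections can lower the reported AP slightly.

```python
import json
from collections import defaultdict


def trim_results(results, min_score=0.05, max_per_image=100):
    """Drop low-score entries and cap the number of entries per image.

    min_score and max_per_image are hypothetical defaults, not values
    taken from the FastInst code or from the COCO server rules.
    """
    per_image = defaultdict(list)
    for entry in results:
        if entry["score"] >= min_score:
            per_image[entry["image_id"]].append(entry)

    trimmed = []
    for image_id in sorted(per_image):
        ranked = sorted(per_image[image_id], key=lambda e: e["score"], reverse=True)
        trimmed.extend(ranked[:max_per_image])
    return trimmed


# Tiny stand-in for a COCO-style segmentation results list (RLE strings are dummies).
results = [
    {"image_id": 1, "category_id": 1, "score": 0.90,
     "segmentation": {"size": [2, 2], "counts": "04"}},
    {"image_id": 1, "category_id": 2, "score": 0.01,
     "segmentation": {"size": [2, 2], "counts": "13"}},
    {"image_id": 2, "category_id": 1, "score": 0.50,
     "segmentation": {"size": [2, 2], "counts": "22"}},
]

trimmed = trim_results(results)
# Compact separators avoid the spaces json.dumps inserts by default.
compact = json.dumps(trimmed, separators=(",", ":"))
print(len(results), "->", len(trimmed), "entries,", len(compact), "characters")
```

Loading the final file locally with `json.load` before uploading also helps rule out a truncated or corrupted file as the cause of the first error.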

Hello, I am trying to improve your model and would like to ask how your frame rate is calculated.

For example, given this output:
[03/30 02:34:58 d2.evaluation.evaluator]: Inference done 1192/1250. Dataloading: 0.0016 s/iter. Inference: 0.0399 s/iter. Eval: 0.1135 s/iter. Total: 0.1550 s/iter. ETA=0:00:08
[03/30 02:35:03 d2.evaluation.evaluator]: Inference done 1227/1250. Dataloading: 0.0016 s/iter. Inference: 0.0399 s/iter. Eval: 0.1132 s/iter. Total: 0.1548 s/iter. ETA=0:00:03
[03/30 02:35:08 d2.evaluation.evaluator]: Total inference time: 0:03:13.537369 (0.155452 s / iter per device, on 4 devices)
[03/30 02:35:08 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:49 (0.039901 s / iter per device, on 4 devices)

Do you compute the frame rate directly as 1 / 0.039901?
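
For reference, the arithmetic in question can be sketched as follows. The helper `fps_from_summary` is a hypothetical name, and the sketch assumes one image per iteration on each device; it does not confirm how the authors computed the FPS they report.

```python
import re


def fps_from_summary(line):
    """Return (images per second per device, device count) from a summary line."""
    match = re.search(r"\(([\d.]+) s / iter per device, on (\d+) devices\)", line)
    if match is None:
        raise ValueError("not an evaluator summary line")
    seconds_per_iter = float(match.group(1))
    devices = int(match.group(2))
    # Assumes one image per iteration on each device.
    return 1.0 / seconds_per_iter, devices


summary = (
    "[03/30 02:35:08 d2.evaluation.evaluator]: Total inference pure compute time: "
    "0:00:49 (0.039901 s / iter per device, on 4 devices)"
)
fps, devices = fps_from_summary(summary)
print(f"{fps:.2f} FPS per device on {devices} devices")
```

With the log above, 1 / 0.039901 gives roughly 25.06 FPS per device. Running on 4 devices at once may also affect per-device timing compared with a single-GPU run.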

Model does not output bounding boxes

I am visualizing the results of the FastInst model using demo.py. The object masks are correct, but all bounding boxes are in the top-left corner of the image. Is this behaviour expected?
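
If the model only predicts masks and leaves the box field as a placeholder (an assumption, not confirmed by the authors here), boxes can be derived from the masks. A minimal pure-Python sketch of the idea, with the hypothetical helper `mask_to_box` operating on a binary mask given as a list of rows:

```python
def mask_to_box(mask):
    """Return the tight box (x0, y0, x1, y1) around the nonzero pixels of a 2D mask.

    x1 and y1 are exclusive. An empty mask yields (0, 0, 0, 0).
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return (0, 0, 0, 0)
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return (min(cols), min(rows), max(cols) + 1, max(rows) + 1)


mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(mask_to_box(mask))
```

For tensors, detectron2's `BitMasks` class provides a `get_bounding_boxes()` method that, to my understanding, serves the same purpose; the version above only illustrates the computation.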

About the number of instances

There are only six categories, but the inference result shows that a total of 100 instances are detected. What is the reason?
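
A likely explanation, stated here as an assumption, is that the model returns a fixed number of candidate predictions per image regardless of the number of categories, and most of them have low confidence. A minimal sketch of filtering such output by score (the helper name and the 0.5 threshold are illustrative):

```python
def filter_by_score(scores, classes, threshold=0.5):
    """Keep only predictions whose confidence is at least `threshold`."""
    kept = [(s, c) for s, c in zip(scores, classes) if s >= threshold]
    return [s for s, _ in kept], [c for _, c in kept]


# Made-up scores and class ids standing in for the raw model output.
scores = [0.97, 0.81, 0.32, 0.04, 0.01]
classes = [2, 5, 2, 0, 3]
kept_scores, kept_classes = filter_by_score(scores, classes)
print(len(scores), "raw predictions ->", len(kept_scores), "kept")
```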

Inference demo.py

0%| | 0/76 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/data3/shenbaoyue/code/FastInst-try-maskattention/demo/demo.py", line 123, in <module>
img = read_image(path, format="BGR")
File "/data3/shenbaoyue/code/FastInst-try-maskattention/detectron2/detectron2/data/detection_utils.py", line 180, in read_image
with PathManager.open(file_name, "rb") as f:
File "/data3/shenbaoyue/anaconda3/envs/fastry/lib/python3.8/site-packages/iopath/common/file_io.py", line 1012, in open
bret = handler._open(path, mode, buffering=buffering, **kwargs) # type: ignore
File "/data3/shenbaoyue/anaconda3/envs/fastry/lib/python3.8/site-packages/iopath/common/file_io.py", line 604, in _open
return open( # type: ignore
IsADirectoryError: [Errno 21] Is a directory: '/'

Process finished with exit code 1

How can I resolve this error, which occurs when I configure the parameters and run demo.py? The training process works normally. I have searched online for a solution without success. I would greatly appreciate any assistance.
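
The path '/' in the traceback suggests that the list of inputs ended up containing a directory (or a single string that was iterated character by character); this is a guess from the error message alone. A minimal sketch of normalizing the input argument into a list of files before reading them, with `expand_inputs` as a hypothetical helper:

```python
import glob
import os
import tempfile


def expand_inputs(inputs):
    """Turn a string or a list of paths/globs into a sorted list of existing files."""
    if isinstance(inputs, str):
        inputs = [inputs]  # avoid iterating over the characters of one string
    files = []
    for pattern in inputs:
        pattern = os.path.expanduser(pattern)
        if os.path.isdir(pattern):
            pattern = os.path.join(pattern, "*")  # a directory means "everything inside"
        files.extend(p for p in sorted(glob.glob(pattern)) if os.path.isfile(p))
    return files


# Demonstration on a temporary directory with two files and one subdirectory.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.jpg", "b.jpg"):
        open(os.path.join(folder, name), "wb").close()
    os.mkdir(os.path.join(folder, "sub"))
    names = [os.path.basename(p) for p in expand_inputs(folder)]
print(names)
```

Subdirectories are skipped, so only regular files reach the image reader.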

About Visualizing the results

Hi, thanks for your wonderful work. I have trained FastInst on a custom dataset and now want to visualize the prediction results. I used demo.py but failed to visualize them. Is there another way to visualize the results?

training log file

Hi,
Thank you for your hard work on FastInst. Could you please provide me with a copy of your training log file?

Best regards,
[Xiaolin]

Not satisfied with overall results of the model.

I finished training the model for roughly 280,000 iterations and I am not satisfied with the overall results.
The overall mean AP for ResNet-50 with the batch size set to 6 is only 27%, and the AP for small objects is only 9%.

What could be causing this drop in AP? Is it the batch size?
I also noticed that whenever I resume training, it starts at iteration 0 instead of resuming from where it left off.
An example of the instance segmentation on a COCO image is shown below.
Original image
000000071074
Result image
img1

How to test models with "coco_2017_test-dev"

I downloaded "image_info_test-dev2017.json" from COCO and used "coco_2017_test-dev" as the TEST dataset. When I run train_net.py to evaluate the model, I cannot get the accuracy, and the result shows "Annotations are not available for evaluation."

I would really appreciate any help in solving this problem.

Error: An error occurred: '>' not supported between instances of 'NoneType' and 'int'

I am a beginner and got this error when running the code: An error occurred: '>' not supported between instances of 'NoneType' and 'int'
The run log is below:

[10/22 19:13:14 detectron2]: Command line arguments: Namespace(config_file='./configs/coco/instance-segmentation/Fast-COCO-InstanceSegmentation.yaml', dist_url='tcp://127.0.0.1:49152', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=[], resume=False)
[10/22 19:13:14 detectron2]: Contents of args.config_file=./configs/coco/instance-segmentation/Fast-COCO-InstanceSegmentation.yaml:
[10/22 19:13:14 detectron2]: Full config saved to ./output/config.yaml
[10/22 19:13:20 d2.engine.defaults]: Model:
[10/22 19:13:20 fastinst.data.dataset_mappers.fastinst_instance_dataset_mapper]: [FastInstInstanceDatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=(416, 448, 480, 512, 544, 576, 608, 640), max_size=853, sample_style='choice'), RandomFlip()]
[10/22 19:13:39 d2.data.datasets.coco]: Loading datasets/coco/annotations/instances_train2017.json takes 18.91 seconds.
[10/22 19:13:40 d2.data.datasets.coco]: Loaded 118287 images in COCO format from datasets/coco/annotations/instances_train2017.json
[10/22 19:13:47 d2.data.build]: Removed 1021 images with no usable annotations. 117266 images left.
[10/22 19:13:51 d2.data.build]: Distribution of instances among all 80 categories:
[10/22 19:13:51 d2.data.build]: Using training sampler TrainingSampler
[10/22 19:13:51 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[10/22 19:13:51 d2.data.common]: Serializing 117266 elements to byte tensors and concatenating them all ...
[10/22 19:13:54 d2.data.common]: Serialized dataset takes 451.21 MiB
/
After that, the error was raised:
An error occurred: '>' not supported between instances of 'NoneType' and 'int'

Problem when trying to train: TypeError: __init__() got an unexpected keyword argument 'dtype'

Hello, I got the error above when trying to train. The full error message is as follows:
File "train_net.py", line 416, in <module>
launch(
File "/home/h/deep_learn/f_d/detectron2/detectron2/engine/launch.py", line 84, in launch
main_func(*args)
File "train_net.py", line 410, in main
return trainer.train()
File "/home/h/deep_learn/f_d/detectron2/detectron2/engine/defaults.py", line 486, in train
super().train(self.start_iter, self.max_iter)
File "/home/h/deep_learn/f_d/detectron2/detectron2/engine/train_loop.py", line 155, in train
self.run_step()
File "/home/h/deep_learn/f_d/detectron2/detectron2/engine/defaults.py", line 496, in run_step
self._trainer.run_step()
File "/home/h/deep_learn/f_d/detectron2/detectron2/engine/train_loop.py", line 493, in run_step
with autocast(dtype=self.precision):
TypeError: __init__() got an unexpected keyword argument 'dtype'
How can I fix this?
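
The traceback shows detectron2 calling `autocast(dtype=...)`. To my knowledge the `dtype` parameter was added to `torch.cuda.amp.autocast` in PyTorch 1.10, so pairing an older PyTorch with a recent detectron2 would produce this error; treat the 1.10 threshold as an assumption to verify against the PyTorch release notes. A minimal sketch of checking an installed version string (for example the value of `torch.__version__`), using hypothetical helper names:

```python
import re


def version_tuple(version):
    """Parse the leading 'major.minor' of a version string such as '1.9.0+cu111'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def autocast_accepts_dtype(torch_version, minimum=(1, 10)):
    """True if the version is at least `minimum` (assumed threshold, see the note above)."""
    return version_tuple(torch_version) >= minimum


for candidate in ("1.9.0+cu111", "1.10.1", "2.0.1+cu118"):
    print(candidate, autocast_accepts_dtype(candidate))
```

If the check fails, either upgrading PyTorch or installing a detectron2 revision that matches the installed PyTorch would be the two directions to try.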

How to resume training from last checkpoint

Hi, I came across your work on instance segmentation and am currently trying to reproduce the results. I was able to train the model for 90,000 iterations, but when I tried to resume training from the last checkpoint, I got errors related to the configuration file not being loaded properly.

As I am new to detectron2, could you provide pointers on how to resume training from an existing checkpoint? Does the resume option expect a config file as an argument, or does it expect model weights?
Thanks
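
As far as I understand detectron2's behaviour, the `--resume` flag is passed together with the same `--config-file`, and the trainer then looks in the configured output directory for a small text file named `last_checkpoint` that holds the name of the most recent checkpoint. A minimal sketch for checking what a resume would pick up; `find_last_checkpoint` and the checkpoint file name are illustrative:

```python
import os
import tempfile


def find_last_checkpoint(output_dir):
    """Return the checkpoint path recorded in `last_checkpoint`, or None if absent."""
    marker = os.path.join(output_dir, "last_checkpoint")
    if not os.path.isfile(marker):
        return None
    with open(marker) as handle:
        name = handle.read().strip()
    return os.path.join(output_dir, name)


# Demonstration with a temporary directory standing in for the output directory.
with tempfile.TemporaryDirectory() as output_dir:
    missing = find_last_checkpoint(output_dir)
    with open(os.path.join(output_dir, "last_checkpoint"), "w") as handle:
        handle.write("model_0089999.pth")
    found = os.path.basename(find_last_checkpoint(output_dir))
print(missing, found)
```

If the function returns None for the real output directory, a resume has nothing to continue from, which would be consistent with training restarting at iteration 0.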

Not able to Classify the dataset

Hello,

I'm trying to use this model on my own dataset, which has 6 different types of objects, and I need to perform instance segmentation on them. I'm currently using the ResNet-101 architecture and the SGD optimizer with a learning rate of 0.0001. Even though I trained the model for 100,000 (1 lakh) epochs, the model is not able to classify the images.
image

Can you please let me know what might be the possible error I made while training.

Question about the experiment

In the paper, you mention the zero-initialized object query and the learnable object query in Table 2 (IA-guided queries). Would you mind telling me how you implemented the zero-initialized object query and the learnable object query?

Best regards,
xiaolin

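Confusion about WEIGHT_DECAY

Hello, I see that MaskFormer uses WEIGHT_DECAY: 0.0001. Why is WEIGHT_DECAY set so much larger in your model? Is there a particular reason? I would appreciate your guidance.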

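Question about small objects

A question for the author: the paper says that the segmenter does not perform well on small objects. Is this because the last layer of the pixel decoder, the 1/8-size feature map, is fed into the transformer decoder? If I upsample the 1/8-size feature map and apply a convolution to obtain a 1/4-size map before feeding it into the transformer decoder, would that improve segmentation accuracy and work better on small objects? Sacrificing some inference speed would be acceptable.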
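
To get a feel for the speed cost of that change, the number of pixel-feature locations the decoder would attend to can be compared directly. This is only back-of-the-envelope arithmetic for a hypothetical 640x640 input; it says nothing about the accuracy effect, and the helper name is illustrative.

```python
def num_locations(height, width, stride):
    """Number of spatial locations in a feature map at the given stride."""
    return (height // stride) * (width // stride)


height, width = 640, 640  # hypothetical input size
at_eighth = num_locations(height, width, 8)
at_quarter = num_locations(height, width, 4)
print(at_eighth, at_quarter, at_quarter / at_eighth)
```

Going from 1/8 to 1/4 resolution quadruples the number of locations (6400 to 25600 here), so attention over the pixel features would be expected to cost roughly four times as much for a fixed number of queries.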

AttributeError: 'collections.OrderedDict' object has no attribute 'detach'

(fastinst) root@/FastInst# python tools/convert-timm-to-d2.py checkpoint_file/fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth checkpoint_file/fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pkl
model -> backbone.model
Traceback (most recent call last):
File "tools/convert-timm-to-d2.py", line 34, in <module>
newmodel[k] = obj.pop(old_k).detach().numpy()
AttributeError: 'collections.OrderedDict' object has no attribute 'detach'
I got the error above when trying to convert the model with your tool. How can I fix it?
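
The error message and the printed "model -> backbone.model" line suggest that the loaded file has a top-level "model" key holding a nested dictionary, whereas the script appears to expect a flat mapping of parameter names to tensors (an inference from the traceback, not from the script's documentation). A minimal sketch of unwrapping such a checkpoint, with `unwrap_state_dict` as a hypothetical helper and plain numbers standing in for tensors:

```python
from collections import OrderedDict


def unwrap_state_dict(checkpoint):
    """Return the nested 'model' mapping if present, else the checkpoint itself."""
    if isinstance(checkpoint, dict) and isinstance(checkpoint.get("model"), dict):
        return checkpoint["model"]
    return checkpoint


# Stand-ins for a nested training checkpoint and a flat state dict.
nested = {"model": OrderedDict(weight=1.0, bias=0.0), "iteration": 100}
flat = OrderedDict(weight=1.0, bias=0.0)

print(list(unwrap_state_dict(nested)))
print(list(unwrap_state_dict(flat)))
```

It may also be that no conversion is needed for this file at all: the demo.py command earlier in this list passes the same .pth file directly through MODEL.WEIGHTS.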

The gap between AP val and AP

"AP val" means mAP on the validation set and "AP" means mAP on the test set, right? The gap between them is large. Is that normal?

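Analysis of the model output

Hello, when I print the detection results for an image, the output consists of predicted masks, bbox, scores, and classes. Why is the bbox output, a 100x4 tensor, all zeros? Also, a threshold is already applied, so why are there always 100 output classes?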

Regarding the issue of resume

Why does training get stuck while loading the weights of the pre-trained model?

[Checkpointer] Loading from “****.pth” ...

It always hangs at this message.

Training error

Running train_net.py raises TypeError: __init__() got an unexpected keyword argument 'dtype'. The error is raised in detectron2/engine/train_loop.py.
Has anyone encountered the same problem? How did you solve it?

About the model input

Hello, is the model's input size dynamic, or is it a fixed, predefined size? What is the format of the size?
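
The training log quoted earlier in this list shows a `ResizeShortestEdge` augmentation with short edges up to 640 and `max_size=853`, which points to a dynamic input size that depends on each image's aspect ratio. The sketch below mirrors my understanding of that resizing rule; the function name is illustrative, and using 640/853 at test time is an assumption.

```python
def resize_shortest_edge(height, width, short_edge=640, max_size=853):
    """Scale so the shorter side equals `short_edge`, then cap the longer side at `max_size`."""
    scale = short_edge / min(height, width)
    if height < width:
        new_h, new_w = short_edge, scale * width
    else:
        new_h, new_w = scale * height, short_edge
    if max(new_h, new_w) > max_size:
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    return int(new_h + 0.5), int(new_w + 0.5)


for shape in [(480, 640), (427, 640), (500, 500)]:
    print(shape, "->", resize_shortest_edge(*shape))
```

Different source images therefore produce different (height, width) inputs, for example 640x853 for a 480x640 image and 640x640 for a square one.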
