syscv / maskfreevis Goto Github PK

Mask-Free Video Instance Segmentation [CVPR 2023]

Home Page: https://arxiv.org/abs/2303.15904

License: Apache License 2.0

Python 89.24% Shell 0.40% C++ 1.03% Cuda 9.33%

video-instance-segmentation weakly-supervised-learning weakly-supervised-segmentation annotation-efficient-learning video-segmentation instance-segmentation cvpr2023

maskfreevis's Issues

Bounding boxes requirement during training?

Great work guys, I was wondering if you require bounding box labels during training? There’s a brief mention of this in the video and also in the paper. Just wanted to make sure.

about the radius=5

Thanks for your time. I would like to know how radius=5 is set. The get_neighbor_images_patch_color_similarity(image_t, image_t+1, kernel=3, di=3) function seems to compute the patch similarity between one pixel in image_t+1 and the corresponding patch in image_t with a radius of 3 (9 pixels). Then, during matching, the topk_mask function selects the top 5 of these 9 pixels. According to the Implementation details, however, the top 5 appear to be selected from 25 pixels (radius=5, K=5).

Did you achieve the larger radius by using dilated convolution? Or does the radius in your paper refer only to the receptive field of one patch, so that only 9 pixels around the region are sampled when choosing the top k?
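
To make the arithmetic in this question concrete, here is a minimal sketch in plain Python. It is not the repository's code: it only assumes a square sampling grid defined by a kernel size and a dilation rate, and the helper names neighbor_offsets, window_extent and topk_indices are hypothetical. It shows that a 3x3 grid with dilation 2 spans a 5x5 window while still sampling only 9 positions, whereas a dense 5x5 grid samples 25.

```python
def neighbor_offsets(kernel, dilation=1):
    """Return the (dy, dx) offsets sampled by a kernel x kernel grid."""
    half = kernel // 2
    return [
        (dy * dilation, dx * dilation)
        for dy in range(-half, half + 1)
        for dx in range(-half, half + 1)
    ]


def window_extent(kernel, dilation=1):
    """Side length, in pixels, of the window spanned by the grid."""
    return dilation * (kernel - 1) + 1


def topk_indices(scores, k):
    """Indices of the k largest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


dense_3 = neighbor_offsets(kernel=3, dilation=1)    # 9 samples, 3x3 window
dilated_3 = neighbor_offsets(kernel=3, dilation=2)  # 9 samples, 5x5 window
dense_5 = neighbor_offsets(kernel=5, dilation=1)    # 25 samples, 5x5 window

print(len(dense_3), window_extent(3, 1))
print(len(dilated_3), window_extent(3, 2))
print(len(dense_5), window_extent(5, 1))

# Made-up similarity scores for 9 candidates; keep the best 5 (K=5).
scores = [0.1, 0.9, 0.3, 0.8, 0.5, 0.2, 0.7, 0.4, 0.6]
print(topk_indices(scores, k=5))
```

Which of these three layouts the paper's "radius" corresponds to is exactly what the question asks; the sketch only separates the number of sampled pixels from the size of the window they cover.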

Looking forward to your reply.

torchScript Export

Hi!

First of all, thanks for this awesome project!
Would it be possible to add a TorchScript export as well? This addition would be extremely helpful for utilizing it with various third-party software.

Thank you,

Paul

Not able to run make.sh

I followed the instructions exactly as given here, but needed to run two additional steps:

conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit
export FORCE_CUDA=1
Then I ran sh make.sh, but it produced the following error:

No CUDA runtime is found, using CUDA_HOME='/home/ash/anaconda3/envs/vision/'
running build
running build_py
running build_ext
/home/ash/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
building 'MultiScaleDeformableAttention' extension
gcc -pthread -B /home/ash/anaconda3/envs/vision/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/ash/Ash/repo/MaskFreeVIS/mask2former/modeling/pixel_decoder/ops/src -I/home/ash/.local/lib/python3.8/site-packages/torch/include -I/home/ash/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/ash/.local/lib/python3.8/site-packages/torch/include/TH -I/home/ash/.local/lib/python3.8/site-packages/torch/include/THC -I/home/ash/anaconda3/envs/vision/include -I/home/ash/anaconda3/envs/vision/include/python3.8 -c /home/ash/Ash/repo/MaskFreeVIS/mask2former/modeling/pixel_decoder/ops/src/cpu/ms_deform_attn_cpu.cpp -o build/temp.linux-x86_64-cpython-38/home/ash/Ash/repo/MaskFreeVIS/mask2former/modeling/pixel_decoder/ops/src/cpu/ms_deform_attn_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
Traceback (most recent call last):
  File "setup.py", line 68, in <module>
    setup(
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/__init__.py", line 107, in setup
    return distutils.core.setup(**attrs)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/dist.py", line 1244, in run_command
    super().run_command(command)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
    self.run_command(cmd_name)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/dist.py", line 1244, in run_command
    super().run_command(command)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 84, in run
    _build_ext.run(self)
  File "/home/ash/anaconda3/envs/vision/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
    self.build_extensions()
  File "/home/ash/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
    build_ext.build_extensions(self)
  File "/home/ash/anaconda3/envs/vision/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
    self._build_extensions_serial()
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
    objects = self.compiler.compile(
  File "/home/ash/.local/lib/python3.8/site-packages/setuptools/_distutils/ccompiler.py", line 600, in compile
    self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
  File "/home/ash/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 581, in unix_wrap_single_compile
    cflags = unix_cuda_flags(cflags)
  File "/home/ash/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 548, in unix_cuda_flags
    cflags + _get_cuda_arch_flags(cflags))
  File "/home/ash/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1773, in _get_cuda_arch_flags
    arch_list[-1] += '+PTX'
IndexError: list index out of range

I can't find a way to fix it.
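
The log above starts with "No CUDA runtime is found" and ends inside _get_cuda_arch_flags with an empty arch_list. In PyTorch's extension builder, when the TORCH_CUDA_ARCH_LIST environment variable is unset, the target architectures are taken from the GPUs that PyTorch can see, so an empty list is consistent with no GPU being visible at build time. The log also shows torch being loaded from ~/.local while CUDA_HOME points at the conda environment. The following is a minimal diagnostic sketch, not a confirmed fix; the function names diagnose_build_env and cuda_visible are hypothetical.

```python
import os


def diagnose_build_env(env):
    """Return hints about environment variables relevant to a CUDA extension build."""
    hints = []
    if not env.get("FORCE_CUDA"):
        hints.append("FORCE_CUDA is not set")
    if not env.get("CUDA_HOME"):
        hints.append("CUDA_HOME is not set")
    if not env.get("TORCH_CUDA_ARCH_LIST"):
        hints.append(
            "TORCH_CUDA_ARCH_LIST is not set; architectures will be "
            "detected from visible GPUs"
        )
    return hints


def cuda_visible():
    """Return True/False if torch is importable, or None if it is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()


if __name__ == "__main__":
    for hint in diagnose_build_env(os.environ):
        print("hint:", hint)
    print("torch sees a CUDA device:", cuda_visible())
```

If no device is visible, setting TORCH_CUDA_ARCH_LIST to the compute capability of the target GPU before running make.sh is a commonly used way to avoid the automatic detection; the right value depends on the hardware and is not given in the log.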

The parameter R of search radius.

Hello. I would like to know whether the search radius parameter R is changed by adjusting the dilation rate of the unfold function.
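
As an illustration of what a dilation rate would change, here is a small plain-Python sketch of an unfold-style neighbor gather. It is an assumption about the mechanism being asked about, not a statement of how the repository implements R; gather_neighbors is a hypothetical helper, and out-of-bounds positions are filled with a padding value.

```python
def gather_neighbors(image, row, col, kernel=3, dilation=1, pad_value=0.0):
    """Collect kernel*kernel values around (row, col), spaced by `dilation`."""
    half = kernel // 2
    height, width = len(image), len(image[0])
    values = []
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            r, c = row + dy * dilation, col + dx * dilation
            inside = 0 <= r < height and 0 <= c < width
            values.append(image[r][c] if inside else pad_value)
    return values


# A 5x5 test image whose pixel value encodes its position: value = 5*row + col.
image = [[5 * r + c for c in range(5)] for r in range(5)]

print(gather_neighbors(image, 2, 2, kernel=3, dilation=1))
print(gather_neighbors(image, 2, 2, kernel=3, dilation=2))
```

With dilation 1 the nine values come from the immediate 3x3 neighborhood of the center pixel; with dilation 2 the same nine slots reach the corners and edges of the 5x5 window, so the reach grows while the sample count stays fixed.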

Training accuracy drops to 0

Hello, and thank you for your wonderful project on VIS. When I train MaskFreeVIS without pre-trained weights, the accuracy gets lower and lower as the number of training steps increases and eventually reaches 0. Do you know why? In addition, even with the pre-trained weights loaded, the training accuracy still ends up at 0 as soon as the network is modified slightly. What is the reason? Thank you!

scripts/visual_video.sh fails

With my current configuration, which follows the requirements, bash scripts/visual_video.sh fails with:

ModuleNotFoundError: No module named 'MultiScaleDeformableAttention'
Please compile MultiScaleDeformableAttention CUDA op with the following commands:
        `cd mask2former/modeling/pixel_decoder/ops`
        `sh make.sh`

even though running make.sh produces:

Installed /usr/lib/python3.8/site-packages/MultiScaleDeformableAttention-1.0-py3.8-linux-x86_64.egg
Processing dependencies for MultiScaleDeformableAttention==1.0
Finished processing dependencies for MultiScaleDeformableAttention==1.0

At the same time, inference with detectron2 (python3 detectron2/demo/demo.py) works as expected.

A reproducible configuration would hopefully eliminate the scripts/visual_video.sh failure; a Dockerfile would be ideal.
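
One thing that can be checked here is whether the interpreter that runs the script is the same one that received the egg under /usr/lib/python3.8/site-packages. The sketch below uses only the Python standard library; locate_module is a hypothetical helper name, and a mismatch between interpreters is only one possible explanation for the error.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Run this with the same interpreter that scripts/visual_video.sh invokes.
    print("interpreter:", sys.executable)
    print("module file:", locate_module("MultiScaleDeformableAttention"))
    for path in sys.path:
        print("search path:", path)
```

If the module file prints as None while the install log reports success, the install and the run are likely using different Python environments.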

Running Demo

Hi @lkeab,
Thanks for your cool project on VIS. I would like to run a simple demo on our own videos, but I couldn't find the instructions in the repo. The demo video folder contains a hyperlink to getting started.md, but it returns a 404 error. Could you update the link or share the location of the README with instructions for running a simple demo?

bugs in demo_video/demo.py

python demo_video/demo.py --config-file=configs/youtubevis_2019/video_maskformer2_R50_bs16_8ep.yaml
--video-input=video_det.avi --output=out/ --confidence-threshold=0.5

[07/26 14:17:12 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='configs/youtubevis_2019/video_maskformer2_R50_bs16_8ep.yaml', input=None, opts=[], output='out/', save_frames=False, video_input='video_det.avi')
[07/26 14:17:13 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /mnt/data1/代码/深度学习/MaskFreeVIS/model_final_r50_0466.pth ...
[07/26 14:17:13 fvcore.common.checkpoint]: [Checkpointer] Loading from /mnt/data1/代码/深度学习/MaskFreeVIS/model_final_r50_0466.pth ...
Traceback (most recent call last):
File "demo_video/demo.py", line 162, in <module>
predictions, visualized_output = demo.run_on_video(vid_frames,args.confidence_threshold)
File "/mnt/data1/代码/深度学习/MaskFreeVIS/demo_video/predictor.py", line 46, in run_on_video
predictions = self.predictor(frames)
File "/mnt/data1/代码/深度学习/MaskFreeVIS/demo_video/predictor.py", line 111, in __call__
predictions = self.model([inputs])
File "/home/zhangzs36/anaconda3/envs/maskfreevis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/data1/代码/深度学习/MaskFreeVIS/demo_video/../mask2former_video/video_maskformer_model.py", line 291, in forward
features = self.backbone(images.tensor)
File "/home/zhangzs36/anaconda3/envs/maskfreevis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/data1/代码/深度学习/detectron2/detectron2/modeling/backbone/resnet.py", line 445, in forward
x = self.stem(x)
File "/home/zhangzs36/anaconda3/envs/maskfreevis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/data1/代码/深度学习/detectron2/detectron2/modeling/backbone/resnet.py", line 356, in forward
x = self.conv1(x)
File "/home/zhangzs36/anaconda3/envs/maskfreevis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/data1/代码/深度学习/detectron2/detectron2/layers/wrappers.py", line 131, in forward
x = self.norm(x)
File "/home/zhangzs36/anaconda3/envs/maskfreevis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/data1/代码/深度学习/detectron2/detectron2/layers/batch_norm.py", line 58, in forward
return F.batch_norm(
File "/home/zhangzs36/anaconda3/envs/maskfreevis/lib/python3.8/site-packages/torch/nn/functional.py", line 2281, in batch_norm
return torch.batch_norm(
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.

Package Version Editable project location

absl-py 1.4.0
antlr4-python3-runtime 4.9.3
black 23.7.0
cachetools 5.3.1
certifi 2023.7.22
charset-normalizer 3.2.0
click 8.1.6
cloudpickle 2.2.1
contourpy 1.1.0
cycler 0.11.0
Cython 3.0.0
detectron2 0.6 /mnt/data1/代码/深度学习/detectron2
filelock 3.12.2
fonttools 4.41.1
fsspec 2023.6.0
fvcore 0.1.5.post20221221
google-auth 2.22.0
google-auth-oauthlib 1.0.0
grpcio 1.56.2
h5py 3.9.0
huggingface-hub 0.16.4
hydra-core 1.3.2
idna 3.4
imageio 2.31.1
importlib-metadata 6.8.0
importlib-resources 6.0.0
iopath 0.1.9
kiwisolver 1.4.4
lazy_loader 0.3
Markdown 3.4.4
MarkupSafe 2.1.3
matplotlib 3.7.2
MultiScaleDeformableAttention 1.0
mypy-extensions 1.0.0
networkx 3.1
numpy 1.24.4
oauthlib 3.2.2
omegaconf 2.3.0
opencv-python 4.8.0.74
packaging 23.1
pathspec 0.11.1
Pillow 9.3.0
pip 23.1.2
platformdirs 3.9.1
portalocker 2.7.0
protobuf 4.23.4
pyasn1 0.5.0
pyasn1-modules 0.3.0
pycocotools 2.0.6
pyparsing 3.0.9
python-dateutil 2.8.2
PyWavelets 1.4.1
PyYAML 6.0.1
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
safetensors 0.3.1
scikit-image 0.21.0
scipy 1.10.1
setuptools 67.8.0
shapely 2.0.1
six 1.16.0
submitit 1.4.5
tabulate 0.9.0
tensorboard 2.13.0
tensorboard-data-server 0.7.1
termcolor 2.3.0
tifffile 2023.7.10
timm 0.9.2
tomli 2.0.1
torch 1.9.0
torchvision 0.10.0
tqdm 4.65.0
typing_extensions 4.6.3
urllib3 1.26.16
Werkzeug 2.3.6
wheel 0.38.4
yacs 0.1.8
zipp 3.16.2

Discrepancy in mAP Results between Provided Model and Self-Trained Model on YTVIS2019 Validation Set

Hello,

I've been experimenting with your VIS model and I've encountered an issue that I hope you can help me clarify.

I trained the model on a server equipped with eight V100-32GB GPUs using the mfvis_nococo/scripts/train_8gpu_mask2former_r50_video.sh script, without any modifications to the parameters. The test results I obtained on the YTVIS2019 validation set are as follows:

For comparison, I also tested the trained weight you provided on the same YTVIS2019 validation set and obtained these results:

As you can see, there is a minor difference in the mAP scores. I was unable to achieve the mAP=42.5 as reported in your documentation. Could you please provide some insight as to why this discrepancy might be occurring?

I appreciate your time and look forward to your response. Thank you.

What is the required normalization for images before they are passed to the model?
What is the necessary post processing (if any) once the model is scored?

Many thanks for a great project!
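
For reference, detectron2-based models usually normalize each pixel by subtracting a per-channel mean and dividing by a per-channel standard deviation, both read from the config keys MODEL.PIXEL_MEAN and MODEL.PIXEL_STD. The sketch below assumes this project follows that convention, which the thread does not confirm. The numeric values are widely used ImageNet RGB statistics and serve only as placeholders; the real values should be taken from the config file in use. normalize_pixel is a hypothetical helper.

```python
# Placeholder statistics (assumption): replace with MODEL.PIXEL_MEAN / MODEL.PIXEL_STD
# from the actual config.
PIXEL_MEAN = (123.675, 116.28, 103.53)
PIXEL_STD = (58.395, 57.12, 57.375)


def normalize_pixel(pixel, mean=PIXEL_MEAN, std=PIXEL_STD):
    """Apply (value - mean) / std to each channel of one pixel."""
    return tuple((v - m) / s for v, m, s in zip(pixel, mean, std))


# A pixel equal to the mean maps to zero in every channel.
print(normalize_pixel(PIXEL_MEAN))

# A mid-gray pixel with channel values in the 0-255 range.
print(normalize_pixel((128.0, 128.0, 128.0)))
```

In detectron2 this step normally happens inside the model's forward pass, so inputs are typically passed as raw 0-255 image tensors; whether that holds here should be checked against the model code.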
