
clip-es's People

Contributors

linyq2117

clip-es's Issues

How can I convert pseudo masks into annotations?

Your work is great, and I really appreciate it!

I have run into a problem that has blocked me for a long time:
after running "eval_cam_with_crf.py", I get pseudo masks in PNG form.
But how can I use these PNGs to train the model?
I mean, is there an established method to convert the pseudo masks into VOC or COCO annotations?
Thanks!
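
In case it helps (an editor's sketch, not the authors' answer): for VOC-style training no conversion is usually needed, because the VOC segmentation label format is itself a single-channel PNG whose pixel values are class indices. The sketch below arranges the CRF-processed pseudo masks the way a VOC-style dataloader expects; all paths and file names are hypothetical.

```python
# A minimal sketch (hypothetical paths, not from the CLIP-ES repo):
# arrange the CRF-processed pseudo masks the way a VOC-style
# segmentation dataloader expects, plus a matching image-id list.
import shutil
from pathlib import Path

pseudo_dir = Path("./output/voc12/pseudo_masks")              # PNGs from eval_cam_with_crf.py
label_dir = Path("./VOCdevkit/VOC2012/SegmentationClassAug")  # hypothetical target folder

label_dir.mkdir(parents=True, exist_ok=True)
ids = []
for png in sorted(pseudo_dir.glob("*.png")):
    # Pixel values are already class indices (0-20 for VOC, 255 = ignore),
    # so the PNG itself is the annotation; no conversion is needed.
    shutil.copy(png, label_dir / png.name)
    ids.append(png.stem)

# Image-id list that the segmentation trainer can read.
Path("./voc12/train_pseudo.txt").write_text("\n".join(ids) + "\n")
```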

SegmentationClass files for the COCO dataset

Hi, how can I obtain the SegmentationClass files for the COCO dataset? I could not find them on the official COCO website, and a web search did not turn up an answer either. Could you let me know? Thanks a lot!
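
For reference, COCO does not ship VOC-style SegmentationClass PNGs; they are usually rendered from the instance annotations. A minimal sketch using pycocotools (my own illustration, not this repo's script; paths are hypothetical, and raw COCO category IDs are non-contiguous and may need remapping):

```python
# Render VOC-style single-channel class masks from COCO instance
# annotations (hypothetical paths; category ids may need remapping).
import numpy as np
from PIL import Image
from pycocotools.coco import COCO

coco = COCO("annotations/instances_train2014.json")
out_dir = "coco/SegmentationClass"  # hypothetical output directory

for img_id in coco.getImgIds():
    info = coco.loadImgs(img_id)[0]
    mask = np.zeros((info["height"], info["width"]), dtype=np.uint8)
    for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id)):
        mask[coco.annToMask(ann) == 1] = ann["category_id"]
    name = info["file_name"].rsplit(".", 1)[0] + ".png"
    Image.fromarray(mask).save(f"{out_dir}/{name}")
```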

Reproducing mIoU 58.6 (Initial) from Table 2

I am very grateful for your solid open-source work. I've encountered a slight issue while trying to reproduce the 58.6 mIoU reported in Table 2. I directly used 'cam_to_refine' as the value of 'cam_refined' and then ran 'eval_cam.py' to evaluate the CAMs (Class Activation Maps). However, the mIoU I obtain is 50. Could you please help me identify what might be causing this discrepancy?

About CLIP RN-50

Thank you very much for your exciting work!

Could you provide the code for generating CAMs with CLIP RN50?

I wish you good luck with your research and a happy life.
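
For anyone experimenting in the meantime, here is a minimal sketch (my own, not the authors' code) of the usual starting point for CAM generation with CLIP RN50: hooking the last convolutional stage of the visual backbone, before attention pooling, to capture a spatial feature map.

```python
# Capture the spatial feature map from CLIP's RN50 visual backbone
# via a forward hook on the last convolutional stage.
import clip
import torch

model, preprocess = clip.load("RN50", device="cpu")
feats = {}

def save_features(module, inputs, output):
    feats["layer4"] = output  # (B, 2048, H/32, W/32)

model.visual.layer4.register_forward_hook(save_features)

image = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image
with torch.no_grad():
    model.encode_image(image)
print(feats["layer4"].shape)  # torch.Size([1, 2048, 7, 7])
```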

"origami" and "rendering

May I ask whether you have code for the experiment on selecting the prompt words "origami" and "rendering"? If so, could you provide it? Thank you; I'm looking forward to your reply.
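
While waiting, here is a minimal sketch of how such prompt templates can be compared (my own illustration, not the authors' experiment code; the class list and image path are hypothetical). The templates come from OpenAI's published ImageNet prompt set.

```python
# Compare prompt templates by how they score an image against a
# hypothetical class list, using the official CLIP package.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/16", device=device)

templates = ["a photo of a {}.", "a origami {}.", "a rendering of a {}."]
classes = ["dog", "cat", "bird"]  # hypothetical class list
image_path = "example.jpg"        # hypothetical image

image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
with torch.no_grad():
    img_feat = model.encode_image(image)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    for t in templates:
        tokens = clip.tokenize([t.format(c) for c in classes]).to(device)
        txt_feat = model.encode_text(tokens)
        txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
        probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)
        print(t, probs.squeeze(0).tolist())
```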

Clarification on Deeplab Model Configuration and Pre-training

Hi,
Thank you for sharing your research and codebase with the community. I'm currently working on replicating the results and have a query regarding the Deeplab model's setup.
In your repository, it's mentioned that your Deeplab training code is adapted from deeplab-pytorch. I wanted to clarify if you strictly adhered to the default Pascal VOC configuration settings provided in that original repository. Additionally, did you utilize the pre-trained model (pretrained on COCO) that the repo offers, or did you employ a different configuration or pre-trained models for your experiments?

Thank you for your time and assistance!

Why Sinkhorn?

Hi, this is great work; I was pleasantly surprised by it. I have some questions about it.
First, I want to know why you chose Sinkhorn normalization to compute the attention-based affinity map, when there are many other methods.
Second, there may be a small typo in the paper: formula (8) seems to be missing parentheses, which confused me.
Last, do you plan to release the training code? I'm looking forward to it 😄
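
For context, Sinkhorn normalization itself is simple: alternately normalize rows and columns until the matrix is approximately doubly stochastic. A minimal sketch of the general technique (my own illustration, not the repo's implementation):

```python
# Sinkhorn normalization: alternate row/column normalization until the
# matrix is roughly doubly stochastic.
import torch

def sinkhorn(affinity: torch.Tensor, n_iters: int = 4, eps: float = 1e-8) -> torch.Tensor:
    """affinity: non-negative (N, N) attention/affinity matrix."""
    a = affinity.clamp(min=eps)
    for _ in range(n_iters):
        a = a / a.sum(dim=1, keepdim=True)  # row normalization
        a = a / a.sum(dim=0, keepdim=True)  # column normalization
    return a

a = sinkhorn(torch.rand(5, 5))
print(a.sum(dim=0), a.sum(dim=1))  # both close to all-ones
```

Compared with a single row-wise softmax, the output treats the two endpoints of each pairwise affinity symmetrically, which is a natural property for an affinity map.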

When is the CGL loss used?

Hello, the paper says the CGL loss is used when training the final segmentation model, but the segmentation stage in the code uses the deeplab-v2 codebase, which does not include this loss. So at which step is this loss applied? Also, I ran the source code, and the generated pseudo masks contain the boundaries of the confidence regions. Were the original pseudo masks and the confidence maps fused directly into new pseudo masks after the cross-entropy loss?
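
For what it's worth, one common way to realize a confidence-guided cross-entropy loss is to mask out low-confidence pixels with the ignore index. A sketch of that general idea (my reading, not necessarily how the paper implements it):

```python
# Confidence-guided cross-entropy sketch: pixels outside the confident
# region are excluded from the loss via the ignore index.
import torch
import torch.nn.functional as F

IGNORE_INDEX = 255

def confidence_guided_ce(logits, pseudo_mask, confident):
    """logits: (B, C, H, W); pseudo_mask: (B, H, W) long;
    confident: (B, H, W) bool, True where the pseudo label is trusted."""
    target = pseudo_mask.clone()
    target[~confident] = IGNORE_INDEX  # drop low-confidence pixels
    return F.cross_entropy(logits, target, ignore_index=IGNORE_INDEX)
```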

The VOC results

Hi, this is excellent work, thanks for sharing.

I have a question about Table 2 in your paper. When I run the code to generate the Initial CAM, Initial + CAA, and Initial + CAA + dCRF results, I get 49.12, 70.7, and 71.1 mIoU. Is this because different settings use different thresholds?
It may also be that a different way of generating the VOC2012 dataset causes this problem, so could you please describe how you generated the VOC2012 dataset?

Thanks for reading. Your work is interesting!

Some details

Sorry to bother you again. Could you please tell me which files, and which parts of the code, implement the innovations you proposed? For example:

  1. Sharpness-based prompt?
  2. Synonym fusion?
  3. CAA?

Running Problems

Hello, I run into problems when running the code, like this:

"RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1."

I have tried many torch versions. Could you tell me what environment you used in your experiments?
Thank you very much; I hope you have a good day.
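
A generic triage sketch (not specific to this repo): set CUDA_LAUNCH_BLOCKING before CUDA is initialized so the reported stack trace is accurate, and check free memory to rule out another process holding the GPU.

```python
# Generic CUDA OOM triage sketch (not specific to this repo).
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # must be set before CUDA is initialized

import torch

print(torch.__version__, torch.version.cuda)
print(torch.cuda.get_device_name(0))
free, total = torch.cuda.mem_get_info()  # needs a recent torch; returns bytes
print(f"free {free / 1e9:.2f} GB of {total / 1e9:.2f} GB")
```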

CAM performance

Thank you for your solid work. I'm reproducing your results; however, I can only reach 65.3 mIoU for the CAMs. I use PyTorch 1.9.0 and CUDA 11.1 on a 3090. I changed the file paths and used the following commands.

CUDA_VISIBLE_DEVICES=0 python generate_cams_voc12.py --split_file ./voc12/train_aug.txt --num_workers 1 --cam_out_dir ./output/voc12/cams
python eval_cam.py --cam_out_dir ./output/voc12/cams --cam_type attn_highres --split_file ./voc12/train.txt

and the result is :
1464 images to eval
1 0.6530047267030804
2 0.6320998620387147
{'Pixel Accuracy': 0.8778208683287688, 'Mean Accuracy': 0.8145488664308644, 'Frequency Weighted IoU': 0.7928293309400717, 'Mean IoU': 0.6530047267030804, 'Class IoU': {0: 0.8535294292729055, 1: 0.6130334632912042, 2: 0.6508222055180352, 3: 0.7070871766635738, 4: 0.5137165538355816, 5: 0.5389611068698299, 6: 0.7805507459439889, 7: 0.6934887957581246, 8: 0.7957022407450646, 9: 0.3969126198448379, 10: 0.763830780381508, 11: 0.4783550023269957, 12: 0.8016436417834419, 13: 0.7574893461824923, 14: 0.7504945026731186, 15: 0.5885225316760763, 16: 0.520919667740022, 17: 0.7986631191640343, 18: 0.5163718776232233, 19: 0.6667840730078058, 20: 0.526220380462825}}

Do you have any advice on reproducing your result?

Question about the Table 1 CAM evaluation

First of all, thanks for the nice work! I have a question about your code.
I assume that in Table 1, 'seed' and 'dCRF' refer to the Initial + CAA and Initial + CAA + dCRF CAMs, respectively.
When I run eval_cam.py and eval_cam_with_crf.py, the results are 65.96% and 68.76%, respectively.
That is a large gap from the paper. However, the final performance trained with those CAMs was similar to the paper's result.

Thank you for reading!

Problem with the link for train_aug groundtruth

Hi
I hope you're doing fine.
I wanted to point out that the link to the train_aug ground-truth file for running the model on the Pascal VOC dataset redirects to Prof. Bharath Hariharan's homepage. Could you please help me get the train_aug ground-truth file some other way?
Thanks a lot
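
For anyone blocked on this: the train_aug labels are commonly built from the SBD release ("benchmark_RELEASE"), whose class masks ship as .mat files. A minimal conversion sketch, assuming the standard SBD layout (paths are hypothetical):

```python
# Convert SBD .mat class masks into VOC-style PNG labels, assuming the
# standard benchmark_RELEASE layout (hypothetical paths).
from pathlib import Path

import numpy as np
import scipy.io
from PIL import Image

src = Path("benchmark_RELEASE/dataset/cls")
dst = Path("SegmentationClassAug")
dst.mkdir(exist_ok=True)

for mat_path in src.glob("*.mat"):
    mat = scipy.io.loadmat(str(mat_path), squeeze_me=True, struct_as_record=False)
    seg = mat["GTcls"].Segmentation  # (H, W), values 0-20
    Image.fromarray(seg.astype(np.uint8)).save(dst / f"{mat_path.stem}.png")
```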

About the performance

Thanks for sharing this great work. I have a question about the performance.

When I ran the provided code and generated the pseudo masks (CAMs, etc.), the performance differed from the reported numbers.

I get 70.71 mIoU. I would expect the number not to change, since the CLIP model is fixed.
Is there any difference between the version you reported and the one you shared?

By the way, thanks for sharing the code!

Evaluating CRF-processed pseudo masks

Dear author:
The evaluated path --cam_out_dir ./output/voc12/cams seems to hold the unprocessed pseudo labels, and the process function only reads npy files. The CRF-processed outputs are PNG images; could you provide code to evaluate the mIoU of the PNGs?
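
Until an official script lands, here is a minimal generic mIoU sketch for PNG masks (my own, not from this repo; paths are hypothetical, and it assumes the split file lists bare image ids and predictions use class indices in [0, 20]):

```python
# Score CRF-processed PNG pseudo masks against VOC ground truth via a
# confusion matrix (generic sketch, hypothetical paths).
from pathlib import Path

import numpy as np
from PIL import Image

NUM_CLASSES = 21
conf = np.zeros((NUM_CLASSES, NUM_CLASSES), dtype=np.int64)

for name in Path("voc12/train.txt").read_text().split():
    pred = np.array(Image.open(f"output/voc12/pseudo_masks/{name}.png"))
    gt = np.array(Image.open(f"VOCdevkit/VOC2012/SegmentationClass/{name}.png"))
    valid = gt != 255  # skip the ignore/boundary label
    conf += np.bincount(
        NUM_CLASSES * gt[valid].astype(np.int64) + pred[valid],
        minlength=NUM_CLASSES ** 2,
    ).reshape(NUM_CLASSES, NUM_CLASSES)

iou = np.diag(conf) / (conf.sum(axis=0) + conf.sum(axis=1) - np.diag(conf))
print("per-class IoU:", iou)
print("mIoU:", np.nanmean(iou))
```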

Softmax processing on Grad-CAM

Thank you for your excellent article! You mention in the article that you apply softmax processing when computing Grad-CAM, but I couldn't find this step in the code. Could you provide the relevant code? Looking forward to your reply!
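
For reference, the usual form of this idea is to backpropagate the softmax probability of the target class instead of its raw logit. A minimal sketch of that general variant (my own illustration, not the authors' code):

```python
# "Softmax before backward" Grad-CAM variant: backpropagate the softmax
# probability of the target class rather than its raw logit.
import torch

def gradcam_from_softmax(logits, target_class, activations):
    """logits: (1, C) class scores; activations: (1, K, H, W) feature map
    captured by a forward hook, still attached to the autograd graph."""
    probs = torch.softmax(logits, dim=-1)
    grads = torch.autograd.grad(probs[0, target_class], activations)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)        # GAP over spatial dims
    cam = torch.relu((weights * activations).sum(dim=1))  # (1, H, W)
    return cam / cam.max().clamp(min=1e-8)
```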

Clarification on COCO Dataset Segmentation Groundtruth

Hello,
I've been exploring your repo and came across the COCO dataset segmentation ground truth. I observed that it is colored, and the pixel-level labels seem to be represented across multiple channels. I was wondering:
How was this colored ground truth generated?
How do the colors in the segmentation GT correspond to class IDs?
Also, is there code here demonstrating how to evaluate COCO segmentation using this ground truth?
Thank you!
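
One general note that may resolve this (not specific to this repo's files): if the colored ground truth is a paletted ("P" mode) PNG, the colors are only a display palette over single-channel class indices, and PIL returns the indices directly:

```python
# Check whether a colored ground-truth PNG is palette-indexed; if so,
# reading it yields class indices, not RGB colors.
import numpy as np
from PIL import Image

gt = Image.open("example_gt.png")  # hypothetical ground-truth file
print(gt.mode)                     # "P" means palette-indexed, not RGB
labels = np.array(gt)              # (H, W) array of class indices
print(np.unique(labels))           # the class IDs actually present
```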

Set of referring image segmentation queries

Thanks for your interesting work!!

I could not find the construction details of the initial text queries for referring image segmentation.

If this detail is already in the paper, I apologize for asking; please excuse me.

Best regards,

Namyup Kim.
