
dynconv's People

Contributors

thomasverelst


dynconv's Issues

About multi-gpu training

Thanks for your awesome work! Is multi-GPU training supported, and if so, how? Training ResNet-101 on ImageNet with a single GPU is unacceptably slow.

Exception occurred:

File "/home/lym/Compare experiment new/classification/main_cifar.py", line 232, in <module>
    main()
File "/home/lym/Compare experiment new/classification/main_cifar.py", line 72, in main
    model = net_module(sparse=args.budget >= 0, pretrained=args.pretrained).to(device=device)
RuntimeError: CUDA error: the launch timed out and was terminated

The code appears to be buggy even though the environment configuration is fine; the server has an RTX 4090 with 24 GB of GPU memory. The failing line is:
model = net_module(sparse=args.budget >= 0, pretrained=args.pretrained).to(device=device)

question about the sparsity_target

Hello, this is brilliant work! I want to use the binary Gumbel-softmax in my own work, but I ran into a problem.
I applied the soft mask to the first layer only (i.e. I applied the generated mask to the features after the first layer) and observed a strange phenomenon: the Gumbel noise seemed to influence the training process too much. When I plotted the sparsity loss alone, I found I usually could not reach the sparsity target I had set. Is this behavior expected?
temp=5.0: (screenshot of the sparsity loss)
temp=1.0: (screenshot of the sparsity loss, later)
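To make question concrete: the sparsity objective I am computing is essentially a quadratic penalty on the deviation of the mean mask activation from the target. This is my own simplification with illustrative names, not necessarily the repo's exact loss:

```python
# Minimal sketch of a quadratic sparsity penalty (names and exact form
# are illustrative, not the repo's actual implementation).
def sparsity_loss(mask_fraction, target):
    # mask_fraction: mean of the soft mask over all spatial positions
    return (mask_fraction - target) ** 2

# Penalty grows as the executed fraction drifts from the budget.
loss = sparsity_loss(0.7, 0.4)
```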

Training on Google Colab

Hello,

Thank you for your effort in making your great work open source. I wanted to ask whether you have a version of the code compatible with Google Colab, so that it can be run without a local GPU?

A question about soft-mask calculation

Wonderful job! I studied your paper and code these days, and they were very enlightening to me.

I have a question about the code that calculates the soft mask via soft = self.maskconv(x). I'm not quite sure why this (conv + fc) network was chosen to calculate the soft mask. Thank you for your kind help.

/annot/valid.json is missing

Thanks for your inspiring work on dynamic convolution. I am interested in this exciting work and tried to test the efficiency of dynconv. I ran the test code for pose estimation and met an exception: No such file or directory: '$mpii_root/annot/valid.json'.
I wonder how I can get this file.

BTW, when I set stride = 2 for conv3x3_dw, it often crashed with 'THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/THC/THCCachingHostAllocator.cpp line=296 error=77 : an illegal memory access was encountered'. Could you help me debug these issues? Thanks in advance.

Pengyu Zhang.

Mask calculation

Insightful work!!!
While studying your paper, I ran into some questions (my English is not very good, and I don't mean to be critical, I'm just confused):

  1. The first problem concerns Figure 2. After the sigmoid, every value is >= 0, yet the figure still uses a threshold of 0 to make the decision. From the paper, I think the threshold should either be 0.5, or no sigmoid should be used.

  2. The second problem is about the code:
        if gumbel_noise:
            eps = self.eps
            U1, U2 = torch.rand_like(x), torch.rand_like(x)
            g1 = -torch.log(-torch.log(U1 + eps) + eps)
            g2 = -torch.log(-torch.log(U2 + eps) + eps)
            x = x + g1 - g2

        soft = torch.sigmoid(x / gumbel_temp)
        hard = ((soft >= 0.5).float() - soft).detach() + soft

However, the paper says: "Note that this formulation has no logarithms or exponentials in the forward pass, typically expensive computations on hardware platforms".
So, in the code, why not just threshold the logits with x >= 0 and skip the sigmoid operation?
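For context: without Gumbel noise, the two decision rules coincide, since the sigmoid is monotone and sigmoid(0) = 0.5 — which is presumably what the paper means by having no exponentials in the forward pass. A small illustrative sketch in plain Python:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

logits = [-1.3, -0.2, 0.0, 0.7, 2.1]
# Thresholding the sigmoid output at 0.5 ...
hard_via_sigmoid = [1 if sigmoid(v) >= 0.5 else 0 for v in logits]
# ... gives the same decisions as thresholding the raw logits at 0.
hard_via_logits = [1 if v >= 0 else 0 for v in logits]
```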


Thanks for your kind help!

Questions about mask generation

Hi @thomasverelst

Congrats, nice work! I have two questions out of curiosity:

  1. Forward pass: Why did you choose to sample from the Bernoulli distribution instead of the Gumbel-softmax? To my knowledge, sampling from the Bernoulli distribution introduces a bias in the gradient estimation which could make optimization trickier. I understand that you would not be able to use sparse convolutions in the training but I wonder if there is another reason.

  2. Have you tried annealing the temperature parameter to less than 1?
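To clarify question 2, the kind of schedule I have in mind is an exponential decay of the Gumbel temperature toward a floor. This is purely hypothetical; the function name and all constants are illustrative, not from the repo:

```python
import math

# Hypothetical Gumbel-temperature schedule: exponential decay from
# temp_0 toward a floor temp_min (all constants illustrative).
def gumbel_temperature(epoch, temp_0=5.0, temp_min=0.5, decay=0.05):
    return max(temp_min, temp_0 * math.exp(-decay * epoch))

temps = [round(gumbel_temperature(e), 3) for e in (0, 20, 100)]
```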

license

Thanks for your work.
What type of license does this project have? MIT, GPL, or something else?

Questions about mask usage in convolution

Hi,

Thanks for your great work!

I have some questions about how the mask is used in your convolution operation. I'm wondering what the purpose of assigning mask to conv_module.__mask__ is: I checked that conv_module(x) does not consult the conv_module.__mask__ attribute when it runs.

def conv1x1(conv_module, x, mask, fast=False):
    w = conv_module.weight.data
    mask.flops_per_position += w.shape[0] * w.shape[1]
    conv_module.__mask__ = mask
    return conv_module(x)

Therefore, I can't see how the masks are applied during the network's forward propagation, for example in the BasicBlock:

x = dynconv.conv3x3(self.conv1, x, None, mask_dilate)
x = dynconv.bn_relu(self.bn1, self.relu, x, mask_dilate)
x = dynconv.conv3x3(self.conv2, x, mask_dilate, mask)
x = dynconv.bn_relu(self.bn2, None, x, mask)
out = identity + dynconv.apply_mask(x, mask)

It seems that only the mask in dynconv.apply_mask(x, mask) has any effect.
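My current reading, sketched below as a simplified, hypothetical reconstruction (not the repo's actual code): in the non-accelerated path the convolution runs densely, apply_mask zeroes the masked-out positions afterwards, and __mask__ serves only bookkeeping such as the FLOP counting visible in conv1x1 above.

```python
# Simplified, hypothetical view of the non-accelerated path: the conv
# output is computed densely, then apply_mask zeroes masked positions.
def apply_mask(x, hard_mask):
    # x, hard_mask: H x W nested lists; hard_mask entries are 0.0 or 1.0
    return [[v * m for v, m in zip(xr, mr)] for xr, mr in zip(x, hard_mask)]

x = [[1.0, 2.0], [3.0, 4.0]]
mask = [[1.0, 0.0], [0.0, 1.0]]
out = apply_mask(x, mask)  # masked-out positions become 0.0
```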

About the "Classification with efficient sparse MobileNetV2"

Excellent work !!!

Recently I have studied your paper and code, which were very enlightening to me. I sincerely think that your work is of great significance for the study of dynamic convolutions. Thank you very much for your excellent work!

By the way, could you please tell me when the code for "Classification with efficient sparse MobileNetV2" will be published? I expect it would be wonderful as well!

Thank you very much!

About pose environment

Hi. Thanks for your work. I am currently running the pose demo, but building the lib folder fails with
"make: *** No targets specified and no makefile found. Stop".
What should I do about it?

Ponder_Cost_Plotting

File "main_cifar.py", line 224, in validate
    viz.plot_ponder_cost(meta['masks'])
File "/Data2/xyz/dynconv-master/classification/utils/viz.py", line 26, in plot_ponder_cost
    ponder_cost = ponder_cost_map(masks)
File "/Data2/xyz/dynconv-master/classification/dynconv/utils.py", line 23, in ponder_cost_map
    return out.squeeze(0).cpu().numpy()
AttributeError: 'NoneType' object has no attribute 'squeeze'

Why does this error appear while plotting the ponder cost map?
