Comments (6)
Hi,
thanks for your comments!
- You're right, that's a mistake; it should indeed be 0.5 instead of 0. Thanks for pointing it out.
- The snippet you posted is the code used during training (and the code part of Fig. 2). During inference, no sigmoid is applied and a threshold at 0 is used instead (hence the mistake in Fig. 2). That's the early return in the Gumbel forward:
dynconv/classification/dynconv/maskunit.py, lines 69 to 71 at commit be1024c
from dynconv.
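The training/inference split described above can be sketched as follows. This is a minimal NumPy illustration (the repo itself uses PyTorch, and the function name here is hypothetical, not the repo's actual maskunit code): training applies a sigmoid to noisy logits, while inference early-returns a hard threshold at 0, which matches thresholding the sigmoid output at 0.5 since sigmoid(x) > 0.5 exactly when x > 0.

```python
import numpy as np

def gumbel_sigmoid_mask(logits, temperature=1.0, training=True, rng=None):
    """Illustrative sketch of a Gumbel-sigmoid mask unit (not the repo's exact code)."""
    if not training:
        # Early return at inference: hard threshold at 0.
        # sigmoid(x) > 0.5 iff x > 0, so thresholding the raw logits at 0
        # is equivalent to thresholding the sigmoid output at 0.5.
        return (logits >= 0).astype(np.float32)
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(1e-8, 1.0 - 1e-8, size=logits.shape)
    noise = np.log(u) - np.log(1.0 - u)  # logistic noise (difference of two Gumbels)
    # Soft, differentiable mask used during training
    return 1.0 / (1.0 + np.exp(-(logits + noise) / temperature))
```

At inference the mask is exactly binary, so sparse positions can be skipped entirely; during training the soft values keep the mask differentiable.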
I got it. Thanks for your reply.
By the way, if training was done on a 1080Ti, 2080Ti or V100, why run the inference comparison on a single 1050Ti?
I am a little curious, because recently it is difficult to find a 1050Ti GPU in the deep learning field. Or is this meant as an example for low-computation devices like mobile phones?
I only have a 1050 Ti in my working machine (the more powerful GPUs are in the servers). I also intended this method for low-computation devices indeed (mobile or laptops). I don't think it makes much sense to use it on very powerful GPUs, since overhead becomes a much more important factor there in order to fully utilize the GPU. If I ever get my hands on an NVIDIA Jetson, I'd like to check the performance there.
Thanks for your patient reply.
In your reply you said:
"I don't think it makes much sense to use it on very powerful GPUs, since overhead becomes a much more important factor there in order to fully utilize the GPUs."
Does that mean that if we run inference on a 2080Ti or V100, we will not get as high a speedup ratio as on the 1050Ti (60% speedup)?
Does that mean that if we run inference on a 2080Ti or V100, we will not get as high a speedup ratio as on the 1050Ti (60% speedup)?
I tried it now on a 1080 Ti, and with a larger batch size (128) it seems fine. But still, this work is experimental and limited to depthwise convolutions for now (e.g. as in MobileNetV2). In practice, the accuracy-speed trade-off of MobileNetV2 on powerful GPUs is barely better than that of a standard ResNet. Also, this work is not compatible with TensorRT, which would probably give better/more consistent speedups. So this is more a proof of concept than production-ready work; ideally it would need to be better integrated into low-level CUDA libraries for better support.
(command used: python -O tools/speedtest.py --cfg experiments/4stack/s025.yaml TEST.BATCH_SIZE_PER_GPU 128)
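The batch-size effect mentioned above (overhead matters less once the GPU is kept busy) is what a throughput benchmark measures. A hypothetical, self-contained harness in the spirit of the repo's tools/speedtest.py is sketched below; the model here is a NumPy stand-in, not the actual network, and all names are illustrative:

```python
import time
import numpy as np

def throughput(fn, batch, iters=50, warmup=5):
    """Measure images/second of a model stand-in `fn` at a given batch size.

    Warm-up iterations keep one-time setup costs out of the measurement,
    which matters when comparing small vs. large batches.
    """
    x = np.random.rand(batch, 64).astype(np.float32)
    for _ in range(warmup):
        fn(x)
    start = time.perf_counter()
    for _ in range(iters):
        fn(x)
    return batch * iters / (time.perf_counter() - start)

w = np.random.rand(64, 64).astype(np.float32)
model = lambda x: np.maximum(x @ w, 0.0)  # stand-in for a network forward pass
ips_32 = throughput(model, batch=32)
ips_128 = throughput(model, batch=128)
```

On a real GPU, per-call launch overhead is amortized over more images at batch 128, which is why the larger batch tends to report higher images/second.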
Great work!!! Your patient replies helped me a lot to understand your paper and novel idea!
With a more powerful 1080Ti, the results show 60% speedup (batch 32) and 96% speedup (batch 128), respectively. So can we say it also makes sense on powerful GPUs?
Another question is about the table results: in the baseline row for the 1080Ti, do both batch 32 and batch 128 obtain 100 images/second?
One field where I think this novel idea can be used is what you said in the "Conclusion and future work" part: high-resolution images might be much faster.
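For concreteness, the speedup percentages quoted above can be recomputed from throughput numbers. Assuming the baseline of roughly 100 images/second mentioned in this thread, the implied sparse throughputs would be about 160 and 196 images/second (these implied values are my inference, not figures from the paper):

```python
def speedup_pct(sparse_ips, baseline_ips):
    # relative speedup, in percent, from images/second throughputs
    return 100.0 * (sparse_ips / baseline_ips - 1.0)

# with a baseline of ~100 images/second:
print(round(speedup_pct(160.0, 100.0)))  # batch 32  -> ~60% speedup
print(round(speedup_pct(196.0, 100.0)))  # batch 128 -> ~96% speedup
```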