Comments (12)
@simonmaurer That is correct.
from bmxnet-v2.
@simonmaurer Sorry for the long wait on the reply: conversion and execution with the C++ API now work for our tested models, but we still have a bit of cleaning up to do regarding the build and CI. The good news is we also upgraded the underlying MXNet to 1.4.0, and we should be able to make the release this or next week.
@simonmaurer Just letting you know that BMXNet with our converter is now available. If you want to use it, please look at the Example/Test, especially the dummy forward pass before training (without it, the model needs additional changes, i.e. retraining the BatchNorm layers).
- `QActivation` makes the input binary and always needs to come before a `QConvolution` (unless the input is already binary for some reason). Since the two belong together so closely, we also added a `BinaryConvolution` block and, for easier parameterization (e.g. clip_threshold, scaling methods, ...), added `activated_conf`, which uses a previously stored configuration to create such `BinaryConvolution` blocks. `qconv_kwargs` is just for testing different configurations of the binary convolution (with and without padding).
- As you like: so far we mostly use it as a standalone tool; I only added it for the test case (basically all lines after 62 are just for testing purposes).
- We have not yet implemented a complete example with C++ for this new version, but conversion to float32 would be the way to go.
- The model converter currently needs to be used to get the faster inference (note: it replaces the training layers with those optimized for inference, and also compresses and transforms the weights). However, you can load the deployment model in Python with a `SymbolBlock` (this is basically done in the test case).
- Basically, the default way to do C++ inference in MXNet should still apply to our framework, except that you of course need to load the converted binarized model (not yet tested; if you encounter problems, please create issues as needed).
After we hybridize to a Symbol, we can also do the inference with C++, since we can export it to the usual `symbol.json` and `.param` files. We implemented the necessary functions in C++, but in a more modular way: instead of using a whole C++ QConvolution operator, we now use the normal convolution and apply the functions needed for binarization before/after the default convolution operator. This is also visible in the symbolic graph now; for example, it contains the `det_sign` functions as additional ops when directly exporting (you could quickly test this with the MNIST example).
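For intuition, the `det_sign` binarization that shows up as an extra op in the exported graph can be sketched in plain NumPy. This is an illustrative standalone reimplementation of the forward pass only, not the framework's actual operator (which also defines a straight-through gradient for training):

```python
import numpy as np

def det_sign(x):
    """Deterministic sign binarization: every value maps to -1.0 or +1.0.

    Unlike np.sign, zero is mapped to +1.0 so the output is strictly binary.
    """
    return np.where(x >= 0, 1.0, -1.0)

x = np.array([-0.7, 0.0, 0.3, -2.1])
print(det_sign(x))  # [-1.  1.  1. -1.]
```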
As for the conversion script, we are currently working on it, but it is not yet finished. It will remove unnecessary operators from the `symbol.json` and convert/compress the `.param` file, similar to the previous version.
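The compression part of such a conversion can be illustrated with a small NumPy sketch (hypothetical, using `np.packbits`; the real converter's on-disk format may differ): a weight tensor whose entries are already in {-1, +1} needs only one bit per value instead of 32.

```python
import numpy as np

# Hypothetical sketch of bit-packing binary weights: a float32 tensor
# with values in {-1, +1} is stored as 1 bit per weight instead of 32,
# a 32x size reduction, and can be restored losslessly.
rng = np.random.default_rng(0)
weights = np.where(rng.random((64, 64)) < 0.5, -1.0, 1.0).astype(np.float32)

packed = np.packbits(weights > 0)                  # 1 bit per weight
unpacked = np.unpackbits(packed)[: weights.size]   # recover the bits
restored = np.where(unpacked == 1, 1.0, -1.0).reshape(weights.shape)

assert (restored == weights).all()
print(weights.nbytes, "->", packed.nbytes)  # 16384 -> 512
```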
Also, we are implementing a different custom operator, which enables fast inference again (but independently of the HybridBlocks used for training). This operator will replace our Gluon convolution blocks during conversion.
@Jopyth OK, thanks. In other words, for fast inference (that is, the custom implementation of GEMM kernels as found in https://github.com/hpi-xnor/BMXNet/tree/master/smd_hpi/src) you are still in the process of rewriting that part?
For now, the binary weights are still treated and saved as float32 throughout the Gluon code, and the code for approximated multiplications (using XNOR and bitcount operations) has yet to be reimplemented from BMXNet v1. Is that what your comment

> We do not yet support deployment and inference with binary operations and models (please use the first version of BMXNet instead if you need this).

in the README refers to?
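For reference, the XNOR-and-bitcount arithmetic mentioned here can be sketched in pure Python/NumPy. This is an illustrative toy, not BMXNet's actual GEMM kernels: it shows why a dot product of {-1, +1} vectors can be computed without multiplications once the vectors are bit-encoded.

```python
import numpy as np

def dot_float(a, b):
    """Reference: ordinary dot product of two {-1, +1} vectors."""
    return int(np.dot(a, b))

def dot_xnor(a_bits, b_bits, n):
    """Same dot product on bit-encoded vectors (+1 -> bit 1, -1 -> bit 0).

    XNOR yields 1 exactly where the signs agree, so with `agree` matching
    positions: dot = agree - (n - agree) = 2 * agree - n.
    """
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)  # mask to the n valid bits
    return 2 * bin(xnor).count("1") - n

rng = np.random.default_rng(0)
a = np.where(rng.random(16) < 0.5, -1, 1)
b = np.where(rng.random(16) < 0.5, -1, 1)

# Pack each vector into a single Python int, one bit per element.
a_bits = int("".join("1" if v == 1 else "0" for v in a), 2)
b_bits = int("".join("1" if v == 1 else "0" for v in b), 2)

assert dot_float(a, b) == dot_xnor(a_bits, b_bits, 16)
```

The speed-up in a real kernel comes from performing the XNOR and popcount on 32 or 64 weights at once per machine word.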
@Jopyth overall, great job and findings in your paper. I am really interested in your work/BMXNet v1, and for real-time applications I'd like to dig into binarized networks and timing analysis (which is why I'm so eager to be able to run it in C++, including the faster inference ;) )
Any news regarding the conversion script?
Also, could you elaborate a bit on what actually happens during the conversion script? I still don't quite get why you need to convert the symbol.json and .param file when you have already implemented the underlying C/C++ operators (or is the C++ API using different operators? That might be the reason why even vanilla MXNet 1.4.0 still doesn't support reduced precision, i.e. float16, in the C++ API).
Maybe because you created custom operators, but only in Python?
Basically, we need the conversion script for two reasons. The first is the same as in the first BMXNet: we need to compress the binary weights with bit-packing. The second is the one you mentioned: we use different operators between training with Python and inference with C++. Previously, we had the functionality for training and inference (sped up on CPU) in the same layer and chose which version to execute based on the inference setting and device. Now we have split up training and inference: training is done with multiple layers (in Gluon/hybrid mode), but during inference we only use our one (sped-up) custom convolution layer.
@Jopyth thanks a lot for pointing that out. looking forward to this useful addition and the upgrade to 1.4.0 - very nice!
Also, there's an interesting discussion regarding the C vs. C++ API in the official MXNet GitHub repo. The C++ API is just a frontend implementation, like Python, but according to the discussion it is missing some modules needed to make use of fast float16 inference; see https://github.com/apache/incubator-mxnet/issues/14159#issuecomment-483883108.
So `<mxnet/c_predict_api.h>` refers to the C API, which is able to do the fast inference, whereas this is not yet true for the C++ API (`<mxnet-cpp/MxNetCpp.h>`).
@Jopyth that is great! Also noteworthy that you keep things updated (i.e. MXNet 1.4.1) - very appreciated.
Closing questions I still have:
- When you build your models, why does the QActivation come before the QConvolution? Is it a special case that you use `**qconv_kwargs` in QConv2D - maybe for debugging purposes, as used in the code?
- You mentioned the Example/Test: do we just convert the model by using subprocess inside Python code (model conversion is done transparently with `export` when using QActivation/QConv2D/QDense) ->
  `output = subprocess.check_output(["build/tools/binary_converter/model-converter", param_file])`
  or use the binary converter as a standalone tool?
- How do you handle your input matrices/images (Python AND C++)? Keeping them as NDArray uint8 from OpenCV (or equivalent), or converting to float32/float16?
- Is the fast inference (backend operators with fast GEMM) also used when we deploy hybridized models with Python, or only if we use a model output by the new converter?
- We never talked about this: a hint on how one can correctly load the converted model in C/C++, i.e. which API to use for fast inference?
Alright, pretty enlightening!
- Thanks for pointing it out. I am pretty used to introducing non-linearities after linear combinations. Does that also mean that if I have multiple QConvolutions, I wouldn't actually need an activation layer in front anymore, because the output of the preceding layer (say QConv2D or QDense) is already binarized?
- So you tested the converted model with faster inference in Python, I guess? I will gladly provide you with information regarding C inference. Not sure yet if the C++ API (which is also only a wrapper) will work.