I have 2 models one is baseline keras model and its equivalent keras model, the models

sandeep1404 commented on May 27, 2024

Difference between Qkeras model and Keras model

from qkeras.

Comments (9)

jurevreca12 commented on May 27, 2024

QKeras is a library for quantization-aware training. It still uses float parameters and activations during training; it only simulates quantization by restricting each float tensor to the set of values representable in the chosen fixed-point format. That is why you will not see a difference in model size when you save it as .h5, and also why training a model in QKeras is slower than training a normal model. QKeras does not currently provide a way to deploy these models to a CPU so that they actually use n-bit parameters. There are, however, tools such as hls4ml that can deploy such a model to an FPGA.
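
To illustrate what "simulating quantization" means here, the following is a minimal sketch in plain Python. It is not QKeras code: `fake_quantize` and its `bits` / `integer_bits` parameters are hypothetical names chosen for this example, and the rounding scheme is a generic signed fixed-point one, which may differ from what QKeras quantizers do internally. The point is only that the output is still a float, just one picked from a small grid of allowed values.

```python
def fake_quantize(x, bits=4, integer_bits=0):
    """Snap a float onto a signed fixed-point grid and return it as a float.

    Hypothetical helper for illustration; not part of QKeras.
    """
    frac_bits = bits - 1 - integer_bits   # one bit is reserved for the sign
    scale = 2 ** frac_bits                # grid step is 1 / scale
    q_min = -(2 ** (bits - 1))            # smallest representable integer code
    q_max = 2 ** (bits - 1) - 1           # largest representable integer code
    q = round(x * scale)                  # nearest grid point
    q = max(q_min, min(q_max, q))         # saturate values outside the range
    return q / scale                      # still stored as a full-width float


weights = [0.8132, -0.2971, 0.0514, -1.7]
quantized = [fake_quantize(w, bits=4) for w in weights]
print(quantized)  # [0.875, -0.25, 0.0, -1.0]
```

Each value in `quantized` takes as much memory as the original float, which is consistent with the saved .h5 file not shrinking.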

sandeep1404 commented on May 27, 2024

Hi @jurevreca12, thanks for your reply. So to see the actual difference in inference time and model size (utilization) between the QKeras model and the unquantized baseline Keras model, I need to port the model to an FPGA using the hls4ml tool, am I right? Is there any other way to measure inference time and utilization at the software level without actually porting to an FPGA? For example, are there any memory profiling tools that can show the difference between the quantized and unquantized models in terms of inference time and resource utilization?

jurevreca12 commented on May 27, 2024

QKeras can quantize to an arbitrary number of bits. This doesn't map well onto processors, whose ALUs typically support 8-bit, 16-bit, and 32-bit operations. That doesn't mean you can't run a model quantized to, for example, 3 bits on a CPU, but it's really not straightforward. If you are looking to run your model on a CPU, I suggest you use another quantization library (e.g. https://www.tensorflow.org/model_optimization/guide/quantization/training). Those will quantize to 8 bits, but you will be able to deploy the model on the CPU.
In FPGAs you can make the ALU any size you want, which is why they are a good fit for QKeras.

sandeep1404 commented on May 27, 2024

@jurevreca12 Thanks for your answer. I have one more query. I want to train a binary neural network (BNN) with one-bit weights and activations using QKeras, and I have trained such a model. If I want to measure its inference time and model size, I cannot do that on a CPU, am I correct, since the CPU architecture is not designed for 1-bit precision? Please correct me if I am wrong. So in order to measure inference time for the BNN model, or for any other precision that does not fit the CPU architecture, such as ternary, 3-bit, or 6-bit, I need to port the model to an FPGA using hls4ml and then measure the model size and inference time there, right?

jurevreca12 commented on May 27, 2024

You can deploy binarized neural networks to a CPU, but QKeras doesn't let you directly generate a CPU implementation. (Binarized networks use the XNOR operation, which CPUs do support.) For binarized neural networks you can also train a network with the Larq library. I believe they even have a CPU deployment engine, although I am not sure if it is openly available (https://docs.larq.dev/compute-engine/).
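
As a rough illustration of why XNOR is the relevant operation, here is a minimal sketch in plain Python of a binarized dot product. It assumes values in {+1, -1} are encoded as bits (1 for +1, 0 for -1); `pack_signs` and `xnor_dot` are hypothetical helper names for this example and are not taken from QKeras or Larq. A real CPU engine would do the same thing on machine words with a hardware popcount.

```python
def pack_signs(values):
    """Pack a list of +1/-1 values into an int; bit i is 1 when values[i] is +1."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask     # XNOR: 1 wherever the signs match
    matches = bin(agree).count("1")       # popcount
    return 2 * matches - n                # matches add +1, mismatches add -1


activations = [1, -1, 1, 1, -1, -1, 1, -1]
weights = [1, 1, -1, 1, -1, 1, 1, -1]

reference = sum(a * w for a, w in zip(activations, weights))
fast = xnor_dot(pack_signs(activations), pack_signs(weights), len(weights))
print(reference, fast)  # 2 2
```

The multiply-accumulate loop collapses into one XOR, one NOT, and one bit count per word, which is the source of the speedup that dedicated BNN engines aim for.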

sandeep1404 commented on May 27, 2024

Thanks @jurevreca12 for answering all my queries patiently. I have worked with Larq before for training BNNs, but those models cannot be ported to an FPGA using hls4ml, since hls4ml doesn't support Larq. So if I want a comparative analysis of my model in terms of model size and inference time for different precisions, from 32-bit (baseline) through 16-bit, 8-bit, 6-bit, 4-bit, 3-bit, 2-bit, and 1-bit using QKeras, I cannot do it on a CPU, and the only way is to port the models to an FPGA using the hls4ml tool flow. Am I correct? Please correct me if I am wrong. Or is there any other way you can suggest to get a comparative analysis of the models in terms of model size and inference time for different bit precisions without porting them to hardware (FPGA)?
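
For the model-size half of such a comparison, a back-of-the-envelope estimate needs no hardware: multiply the number of parameters by the bit width. The sketch below does only that arithmetic. `packed_size_bytes` is a hypothetical helper, `NUM_PARAMS` is a made-up parameter count, and the result is an idealized lower bound that assumes perfectly packed weights with no metadata. It says nothing about inference time or about FPGA resource utilization.

```python
def packed_size_bytes(num_params, bits):
    """Idealized storage for num_params weights at the given bit width."""
    return (num_params * bits + 7) // 8   # round up to whole bytes


NUM_PARAMS = 100_000  # hypothetical parameter count, for illustration only

for bits in (32, 16, 8, 6, 4, 3, 2, 1):
    size = packed_size_bytes(NUM_PARAMS, bits)
    print(f"{bits:>2}-bit: {size:>7} bytes")
```

With these assumptions the 32-bit baseline needs 400000 bytes and the 1-bit model 12500 bytes.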

Prince5867 commented on May 27, 2024

Hello, may I ask which versions of Keras and QKeras you are using? I am using TF 2.11 and I am running into an incompatibility problem.

sandeep1404 commented on May 27, 2024

Hi, I am using QKeras version 0.9.0 and TensorFlow version 2.12.0, and I didn't face any incompatibility problem. If you run into anything, please let us know.

Prince5867 commented on May 27, 2024

Strange to say, I was using TF version 2.11.2 at the time, and installing QKeras would uninstall my Keras first. Now that I have tried your TF version, the problem has disappeared.
