CNN-Resource-Model

Modeling the logic utilization of Haddoc2-generated mappings. We rely on linear models to predict the logic resources (reported in ALMs) used by the SCM (multipliers) and MOA (adders) parts. The inputs of these linear models are metrics computed directly from the topology and weights of a given CNN. These metrics are:

nb_null: Number of null (zero) weights in a given 3D convolution kernel.
nb_pow2: Number of weights that are equal to a power of two in a given 3D convolution kernel. The multiplication by these weights is implemented by means of shift registers, which consume fewer resources than multipliers.
nb_bit1: Number of bits that are set to one in a given 3D convolution kernel. Intuitively, the higher this number, the higher the resource utilization.
nb_efbw: With the metrics above, we were not able to accurately predict the hardware resources, especially for the adder parts, so we came up with this metric. The accumulation of partial products in Haddoc2 is achieved with a MOA whose operands have variable bitwidths. The complexity of such an adder is correlated with the number of inputs, but also with the dynamic range of the partial sums. To illustrate this, consider the dot-product of a vector x with a weight vector w = [2 0 18 256], and suppose weights and inputs are represented in an 8-bit fixed-point format.
- The multiplication by the first coefficient can be implemented with a shift register, and the resulting partial product p[0] = x[0] * w[0] requires 8+mcl(2) = 9 bits to be represented, where mcl(x) = max(ceil(log2(x))).
- The multiplication by the second coefficient is skipped and does not generate any partial product.
- The partial product of the multiplication by the third coefficient requires 8+mcl(18) = 13 bits to be represented.
- The multiplication by the last coefficient is implemented by means of a shift register, and the partial product requires 8+mcl(256) = 16 bits to be represented.
- Finally, the accumulation of these partial products is achieved with a MOA whose inputs are respectively 9, 13 and 16 bits wide. The complexity of this adder is thus correlated with the number of partial products and their dynamic range, which in turn is related to the dynamic range of the weights of the 3D convolution kernel.
The nb_efbw of a given kernel is defined as: nb_efbw = sum(bw_in + mcl(bw_theta)).
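The four metrics can be sketched in a few lines of Python. This is only an illustration, not code from this repository: the names kernel_metrics, is_pow2 and the bw_in parameter are hypothetical, the kernel is assumed to be given as a flat list of integer (quantized) weights, negative weights are assumed to be handled through their absolute value, and the sum in nb_efbw is assumed to run over the non-null weights only, as in the example above.

```python
def mcl(w: int) -> int:
    """ceil(log2(|w|)) for a non-zero integer weight.

    Computed with integer arithmetic: for n >= 1, (n - 1).bit_length()
    equals ceil(log2(n)), which avoids floating-point rounding.
    """
    return (abs(w) - 1).bit_length()


def is_pow2(w: int) -> bool:
    """True if |w| is a power of two (1 counts as 2**0, by assumption)."""
    a = abs(w)
    return a != 0 and (a & (a - 1)) == 0


def kernel_metrics(weights, bw_in=8):
    """Compute the four metrics for a flattened kernel of integer weights.

    bw_in is the input bitwidth (8 in the example above).
    """
    non_null = [w for w in weights if w != 0]
    return {
        # weights equal to zero generate no partial product
        "nb_null": len(weights) - len(non_null),
        # weights whose multiplication reduces to a shift
        "nb_pow2": sum(is_pow2(w) for w in non_null),
        # bits set to one, counted on the magnitude of each weight
        "nb_bit1": sum(bin(abs(w)).count("1") for w in non_null),
        # sum of the bitwidths of the partial products
        "nb_efbw": sum(bw_in + mcl(w) for w in non_null),
    }


metrics = kernel_metrics([2, 0, 18, 256])
print(metrics)
```

For w = [2 0 18 256], the partial products are 9, 13 and 16 bits wide, so nb_efbw is 9 + 13 + 16 = 38 under these assumptions.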
We found that this nb_efbw metric is the most pertinent single feature for modeling the hardware resources, as shown in the following tables, where the R-squared scores of the models built with different features are reported. GLM stands for the Generalized Linear Model, in which all four previous features are combined to model the resource usage.

MOA	Alexnet	Squeezenet	Alexnet-Comp.
nb_null	0.7345
nb_pow2	0.3722		0.3851
nb_bit1	0.6589	0.5779	0.6744
nb_efbw	0.7759	0.7109	0.7784
GLM	0.8139	0.7372	0.8105

SCM	Alexnet	Squeezenet	Alexnet-Comp.
nb_null	0.7345
nb_pow2	0.2230		0.2250
nb_bit1	0.5884	0.5070	0.6098
nb_efbw	0.7262	0.5906	0.7481
GLM	0.8010	0.6902	0.8328
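To make the R-squared scores above concrete, here is a minimal sketch of how a single-feature linear model can be fitted and scored. It is not the code used to produce the tables: fit_line and r_squared are hypothetical helper names, the fit is an ordinary least-squares line on one feature (the GLM rows combine all four features instead), and the data points are made-up placeholders, not measurements.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Placeholder data (NOT measured): one nb_efbw value and one ALM count per kernel.
nb_efbw = [38, 52, 61, 75, 90]
alms = [120, 150, 185, 210, 260]

slope, intercept = fit_line(nb_efbw, alms)
print("R-squared:", r_squared(nb_efbw, alms, slope, intercept))
```

A score close to 1 means the feature explains most of the variance of the resource usage; this is how the rows of the tables above can be compared.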

kamelabdelouahab / cnn-resource-model