
ml-suite's Introduction

Xilinx ML Suite v1.5

Xilinx ML Suite is now deprecated. Please use Vitis AI in place of ML Suite for all AI acceleration on Xilinx platforms.

ml-suite's People

Contributors

bryanloz-xilinx, haffon, kamranjk, wilderfield


ml-suite's Issues

no such file xfdnn_compiler_caffe.py or quantize.py

The command-line tool demo (xfdnn_compiler_caffe_vgg16.sh) in the examples directory is unable to run because only .pyc files are provided under the xfdnn directory. Fixed by replacing the call to xfdnn_compiler_caffe.py with xfdnn_compiler_caffe.pyc.

Yolo2 demo error "AssertionError: Theshold is not a scalar"

When I run "./run.sh aws e2e" on AWS F1 with FPGA Developer AMI 1.4.0, it finishes with error as follows.

....

Processing layer 1 of 54
Layer Name:conv0 Type:Convolution
Inputs: ['data'], Outputs: ['conv0']
Quantizing conv input layer ... conv0
Traceback (most recent call last):
File "yolo.py", line 86, in
quantizer.quantize()
File "./xfdnn/tools/quantize/quantize.py", line 157, in quantize
File "./xfdnn/tools/quantize/quantize_caffe.py", line 110, in executeCalibration
File "./xfdnn/tools/quantize/quantize_caffe.py", line 225, in preProcess
File "./xfdnn/tools/quantize/quantize_base.py", line 142, in QuantizeThresholdBlob
AssertionError: Theshold is not a scalar
(ml-suite) [root@AWS]

Caffe - ERROR: Unable to create handle to FPGA

opendir: Path /sys/bus/pci/devices/0000:00:1d.0/drm does not exist or could not be read: No such file or directory
[0]user:0x1042:0x7:[xdma:2017.1.47:65535]
xclProbe found 1 FPGA slots with xocl driver running
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD65535
WARNING: AwsXcl isGood: invalid user handle.
WARNING: xclOpen Handle check failed
[0]user:0xf010:0x1d51:[???:??:0]
device[0].user_instance : 0
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD0
WARNING: AwsXcl isGood: invalid user handle.
ERROR: xclOpen Handle check failed
ERROR: Failed to find an OpenCL platform

Is anyone aware of this problem?

Thanks

compile error unsupported operand type(s)

Hi,

I tried to use the compiler xfdnn_compiler_caffe.pyc to compile bvlc_googlenet_without_lrn_deploy.prototxt.
It outputs:


  • COMPUTING MEMORY REQUIREMENTS

ParametersLayer(type=[u'Pooling'], number_outputs=None, paddings=[[]], kernel_sizes=[[3L]], strides=[[2L]], dilation=None, group=None, shapes=None, sizes=None, quantizations=None, batches=None, layer_type=['layer'], extras_and_future=None, tops=[u'pool1/3x3_s2'], bottoms=[u'conv1/7x7_s2'], layer=[name: "pool1/3x3_s2"
type: "Pooling"
bottom: "conv1/7x7_s2"
top: "pool1/3x3_s2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
], dag=ColorForDAG(active=[0], schedule=3, forward=[4], backward=[1, 2], extra=None, hook=[]), systolic_width=None, word_width=None, alignedsizes=None, bias=None, scale=None, preshift=None, postshift=None, batchnormalization=None, relu=None, fcmode=None, pool=0, operation=None, scaling=None, data=None, name=u'pool1/3x3_s2', input_addresses=None, output_addresses=None, data_movements=None, data_movement_costs=None, instructions=None, tf_pad=None)
Failed to complete compilation: unsupported operand type(s) for *: 'int' and 'RepeatedScalarFieldContainer'

Classification example fails in Oregon, but works fine in N. Virginia

The classification example under 'ml-suite/examples' does not work in the Oregon region, where an AFI handling error occurs.
I tried N. Virginia instead (built a fresh FPGA Developer AMI).

However, ml-suite does not provide overlay_0/overlay_1.xclbin for the 'med' kernel size.
Therefore, I used 'large' instead of 'med', and it worked. See details below.

[centos@AWS] ls
Anaconda2-5.1.0-Linux-x86_64.sh  aws-fpga/  awsver.txt	ml-suite/
[centos@AWS] 
[centos@AWS] ls
Anaconda2-5.1.0-Linux-x86_64.sh  aws-fpga/  awsver.txt	ml-suite/
[centos@AWS] sudo su
[root@ip-172-31-90-240 work.ml]# source ml-suite/overlaybins/setup.sh aws
..... messages not shown
[root@ip-172-31-90-240 work.ml]# source ~centos/.bashrc
[root@AWS] source activate ml-suite
(ml-suite) [root@AWS] source aws-fpga/sdaccel_setup.sh 
..... messages not shown
XILINX_OPENCL=/opt/Xilinx/SDx/2017.4.rte.dyn
INFO: The default AWS Platform has been set to: "AWS_PLATFORM=$AWS_PLATFORM_DYNAMIC_5_0" 
INFO: SDAccel runtime installed
INFO: SDAccel Setup PASSED
(ml-suite) [root@AWS] cd ml-suite/examples/classification
(ml-suite) [root@AWS] ./run.sh aws test_classify large 8
make: Entering directory `/home/centos/work.ml/ml-suite/apps/yolo/nms'
cd ./nms_20180209 && make
make[1]: Entering directory `/home/centos/work.ml/ml-suite/apps/yolo/nms/nms_20180209'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/centos/work.ml/ml-suite/apps/yolo/nms/nms_20180209'
make: Leaving directory `/home/centos/work.ml/ml-suite/apps/yolo/nms'
=============== pyXDNN =============================
[XBLAS] # kernels: 1
[0]user:0xf010:0x1d51:[xocl:2017.4.5:128]
xclProbe found 1 FPGA slots with xocl driver running
Linux:3.10.0-693.21.1.el7.x86_64:#1 SMP Wed Mar 7 19:03:37 UTC 2018:x86_64
Distribution: CentOS Linux release 7.4.1708 (Core) 
GLIBC: 2.17
--- 
XILINX_OPENCL="/home/centos/work.ml/ml-suite/overlaybins/aws"
LD_LIBRARY_PATH="/home/centos/work.ml/ml-suite/overlaybins/aws/runtime/lib/x86_64/:/home/centos/work.ml/ml-suite/xfdnn/rt/xdnn_cpp/build/lib:/home/centos/work.ml/ml-suite/xfdnn/rt/lib:/home/centos/work.ml/ml-suite/ext/boost/lib:/home/centos/work.ml/ml-suite/ext/zmq/libs:/home/centos/work.ml/ml-suite/examples/classification"
--- 
CL_PLATFORM_VENDOR Xilinx
CL_PLATFORM_NAME Xilinx
CL_DEVICE_0: 0x1cee080
CL_DEVICES_FOUND 1, using 0
loading /home/centos/work.ml/ml-suite/overlaybins/aws/overlay_2.xclbin
AFI not yet loaded, proceed to download.
AFI load complete.
[XBLAS] kernel0: kernelSxdnn_0
[XDNN] loading xclbin settings from /home/centos/work.ml/ml-suite/overlaybins/aws/overlay_2.xclbin.json
[XDNN] using custom DDR banks 2

[XDNN] kernel configuration
[XDNN]   num cores       : 1
[XDNN]   dsp array width : 56
[XDNN]   img mem size    : 5 MB
[XDNN]   version         : 2.2
[XDNN]   8-bit mode      : 1
[XDNN]   Max Image W/H   : 1023
[XDNN]   Max Image Depth : 4095

Loading weights/bias/quant_params to FPGA...

After FPGA (366.019964 ms)

After FC (1.812935 ms)

After Softmax (3.849030 ms)


---------- Prediction 0 for dog.jpg ----------
0.6950 - "n02112018 Pomeranian"
0.1706 - "n02123394 Persian cat"
0.0229 - "n02492035 capuchin, ringtail, Cebus capucinus"
0.0176 - "n02123597 Siamese cat, Siamese"
0.0138 - "n02085620 Chihuahua"


Success!

(ml-suite) [root@AWS] 

Thank you to those who prepared ml-suite.

"git lfs pull" error (This repository is over its data quota)

Getting an error when I try git lfs pull:

$ git lfs pull
batch response: This repository is over its data quota. Purchase more data packs to restore access.
error: failed to fetch some objects from 'https://github.com/Xilinx/ml-suite.git/info/lfs'

Problem launching instance


Your instance configuration is not eligible for the free usage tier
To launch an instance that's eligible for the free usage tier, check your AMI selection, instance type, configuration options, or storage devices. Learn more about free usage tier eligibility and usage restrictions.

I have subscribed to Machine Learning Development Stack from Xilinx, Preview Edition at the $1.65 rate via AWS.
I'm not sure why I'm receiving this error.

Running a profiler through XCL bins file

I want to get power and area estimates for the xclbin files, for which I need to create a new xclbin file. Is there a guide that describes how to generate these files?
Regards
Adarsh

Why there is "w/" string in some document?

File "ml-suite/docs/tutorials/README.md":
xfDNN Tools
Using the xfDNN Compiler w/ a Caffe Model
Where the "w/" come from? this issue is existed in their corresponding .ipynb document.

Failure loading weights to FPGA

I've worked through #33, which was a misconfigured datadir for our custom YOLO caffemodel. However, I'm still seeing an error when trying to load the weights to the FPGA.

Loading weights/bias/quant_params to FPGA...
python: xdnn.cpp:4070: int XDNNFillWeightsBiasQuantBlob(short int*, int, std::string, std::string, const DType*, unsigned int, float, const DType*, unsigned int, float, short unsigned int, short unsigned int, unsigned int, unsigned int) [with DType = float; std::string = std::basic_string]: Assertion `fpgaCfgFile.empty() || qp != __null' failed.

RDP Services

Is this instance preconfigured with remote desktop protocol (RDP) services?

Low Performance for Yolo V2 while running detection on ML Suite

Hi,

I'm running ML Suite and a retrained YOLOv2 on an Amazon F1 instance with the Xilinx Alveo AMI. I'm getting an average inference time of 80 ms/image with quite low accuracy (compared to 30 ms/image on a GPU).

My goal is to understand why we have this huge gap in inference time, and also:

  • Is this a normal and expected inference time on the Alveo200?
  • If not, what kind of improvements could be made? (Loading images separately into FPGA memory, for example?)

Attached is the output of the detection on an AWS F1 instance: shell_results_xilinx.txt. Any help would be appreciated!

Many thanks, Cheers.

Use the compiler in the command line mode.

Hello, I was trying the command provided in the tutorial:
https://github.com/Xilinx/ml-suite/blob/master/docs/tutorials/compile.md

python tests/xfdnn_compiler_caffe.py \
  -n ~centos/ml-suite/models/caffe/bvlc_googlenet_without_lrn/fp32/bvlc_googlenet_without_lrn_deploy.prototxt \
  -s all \
  -m 4 \
  -i 28 \
  -g network.cmd \
  -w bvlc_googlenet_without_lrn.caffemodel
I am getting an error saying the caffemodel file doesn't exist or cannot be read. Can anyone tell me the proper way to use the command?

-w /wrk/acceleration/MLsuite

In ml-suite/examples/compile/xfdnn_compiler_caffe_resnet.sh, lines 29 and 38:
-w /wrk/acceleration/MLsuite/master/models/caffe/resnet/int8/resnet50_without_bn.caffemodel
it should be
-w $MLSUITE_ROOT/models/caffe/resnet/fp32/resnet50_without_bn.caffemodel \

loading AFI error

I am trying to run Image Classification with Python APIs:
https://github.com/Xilinx/ml-suite/blob/master/examples/classification/README.md

I am in the EU (Ireland) region, and got an error while loading the AFI:

(ml-suite) [root@ip-xxx-xx-xx-xxx classification]# ./run.sh -p aws -t test_classify -k med -b 16
/ml-suite/overlaybins/setup.sh: line 20: /opt/xilinx/xrt/setup.sh: No such file or directory
20174
make: Entering directory `/ml-suite/apps/yolo/nms'
cd ./nms_20180209 && make
make[1]: Entering directory `/ml-suite/apps/yolo/nms/nms_20180209'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/ml-suite/apps/yolo/nms/nms_20180209'
make: Leaving directory `/ml-suite/apps/yolo/nms'

Running:
 Test: test_classify
 Model: googlenet_v1
 Fpgaoutsz: 1024
 Platform: aws
 Xclbin: overlay_1.xclbin
 Kernel Config: med
 Precision: 16
 Accelerator: 0

[XBLAS] # kernels: 1
[0]user:0xf010:0x1d51:[xocl:2017.4.5:128]
xclProbe found 1 FPGA slots with xocl driver running
Linux:3.10.0-693.21.1.el7.x86_64:#1 SMP Wed Mar 7 19:03:37 UTC 2018:x86_64
Distribution: CentOS Linux release 7.5.1804 (Core) 
GLIBC: 2.17
--- 
XILINX_OPENCL="/ml-suite/overlaybins/aws"
LD_LIBRARY_PATH="/ml-suite/overlaybins/aws/runtime/lib/x86_64/:/ml-suite/xfdnn/rt/xdnn_cpp/build/lib:/ml-suite/xfdnn/rt/lib:/ml-suite/ext/boost/lib:/ml-suite/ext/zmq/libs:/ml-suite/examples/classification"
--- 
CL_PLATFORM_VENDOR Xilinx
CL_PLATFORM_NAME Xilinx
CL_DEVICE_0: 0x1ff3b40
CL_DEVICES_FOUND 1, using 0
loading /ml-suite/overlaybins/aws/overlay_1.xclbin
AFI not yet loaded, proceed to download.
ERROR: Failed to create compute program from binary -44

I noticed issue #22, written in July, which has the same problem; it recommends creating a new F1 instance in us-east.

Is this still the issue? I prefer not to mix regions, because I have many instances running in Ireland.

Inference multiple images

Thanks again for your great work!
I'm now trying to run inference on multiple images without recreating FPGA handles or ExecData in ml-suite/xfdnn/rt/xdnn.py, so I changed the script ml-suite/examples/classification/test_classify.py to something like this:

#!/usr/bin/python

import os.path
import math
import sys
import timeit
import xdnn, xdnn_io
import numpy as np
import types

def main():
  args = xdnn_io.processCommandLine()
  ret = xdnn.createHandle(args['xclbin'], "kernelSxdnn_0", args['xlnxlib'])
  if ret != 0:
    sys.exit(1)
  (weightsBlob, fcWeight, fcBias ) = xdnn_io.loadWeights( args )
  for i in range(300):
    (fpgaInputs, batch_sz) = xdnn_io.prepareInput( args )
    fpgaOutput = xdnn_io.prepareOutput(args['fpgaoutsz'], batch_sz)

    startTime = timeit.default_timer()
    xdnn.execute(args['netcfg'],
      weightsBlob, fpgaInputs, fpgaOutput,
      batch_sz, # num batches
      args['quantizecfg'], args['scaleB'], args['PE'])

    elapsedTime = timeit.default_timer() - startTime
    print "\nAfter FPGA (%f ms)" % (elapsedTime*1000)

    startTime = timeit.default_timer()
    if(fcWeight is None):
      startTime = timeit.default_timer()
      softmaxOut = xdnn.computeSoftmax(fpgaOutput, batch_sz)
      elapsedTime = timeit.default_timer() - startTime
      print "\nAfter Softmax (%f ms)" % (elapsedTime*1000)
    else:
      fcOut = xdnn.computeFC(fcWeight, fcBias, fpgaOutput,
      batch_sz, args['outsz'], args['fpgaoutsz'], args['useblas'])
      elapsedTime = timeit.default_timer() - startTime
      print "\nAfter FC (%f ms)" % (elapsedTime*1000)
      #for i in range(10):
      #  print "%f" % fpgaOutput[i],
      startTime = timeit.default_timer()
      softmaxOut = xdnn.computeSoftmax(fcOut, batch_sz)
      elapsedTime = timeit.default_timer() - startTime
      #print "\nAfter Softmax (%f ms)" % (elapsedTime*1000)

    #for i in range(10):
    #  print "%f" % fpgaOutput[i],

    xdnn_io.printClassification(softmaxOut, args);
    if(i % 3 == 1):
      args["images"] = ['./flower.jpg']
    elif(i % 3 == 2):
      args["images"] = ['./cat.jpg']
    else:
      args["images"] = ['./dog.jpg']

  #print "\nSuccess!\n"
  xdnn.closeHandle()

if __name__ == '__main__':
  main()

I check whether fcWeight is None, since some models do not have an FC layer (it is implemented by a Conv layer instead).

Then I use the command ./run.sh aws test_classify large 8 to run the program.
However, the for loop, which runs 300 iterations, crashes at the 256th iteration at the xdnn.execute() line. It seems that a uint8 resource isn't freed after use, and it generates the error message:

python: xmlrt.cpp:106: std::pair<_cl_mem*, int> XComputeUnit::getCreateBuffer(void*, int, cl_mem_flags): Assertion `_numFreeMemSlots > 0' failed.

Then I looked into ml-suite/xfdnn/rt/xdnn.py; it crashes at the last line of the execute function, namely self._lib.XDNNExecute(execData._executor, inputs, outputPtr, numFpgaBatches, blocking). However, I have no access to the source code of the dynamically linked library.
Could you please help me with this?
Or, if possible, please publish the source code of the library ml-suite/xfdnn/rt/xdnn_cpp/lib/libxfdnn.so.
Thank you!
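A minimal workaround sketch, under the assumption that the device buffer slots are only reclaimed when the FPGA handle is closed: process the 300 iterations in chunks and recreate the handle between chunks. This is not from the original thread; the chunk size of 200 is arbitrary, and the xdnn/xdnn_io calls are the same ones used in the script above.

#!/usr/bin/python
# Hedged workaround sketch: recreate the FPGA handle every `chunk` iterations
# so that the device buffer slots exhausted around iteration 256 are released.
import sys
import xdnn, xdnn_io

def run_in_chunks(total=300, chunk=200):
  args = xdnn_io.processCommandLine()
  for start in range(0, total, chunk):
    # (Re)open the handle and reload weights for this chunk.
    if xdnn.createHandle(args['xclbin'], "kernelSxdnn_0", args['xlnxlib']) != 0:
      sys.exit(1)
    (weightsBlob, fcWeight, fcBias) = xdnn_io.loadWeights(args)
    for i in range(start, min(start + chunk, total)):
      (fpgaInputs, batch_sz) = xdnn_io.prepareInput(args)
      fpgaOutput = xdnn_io.prepareOutput(args['fpgaoutsz'], batch_sz)
      xdnn.execute(args['netcfg'], weightsBlob, fpgaInputs, fpgaOutput,
                   batch_sz, args['quantizecfg'], args['scaleB'], args['PE'])
    # Close the handle to free the per-handle device buffers before the next chunk.
    xdnn.closeHandle()

if __name__ == '__main__':
  run_in_chunks()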

sdaccel_setup.sh script not working

Hi everyone,
I am trying to run on the AWS FPGA instance, so I was running the sdaccel_setup script. The script worked for me on Saturday 9/22/2018, but today when I tried it, it showed me the following error:
make[2]: Leaving directory `/usr/src/kernels/3.10.0-862.11.6.el7.x86_64'
make -C /lib/modules/3.10.0-862.11.6.el7.x86_64/build M=/home/centos/aws-fpga/sdk/linux_kernel_drivers/xdma modules_install
make[2]: Entering directory `/usr/src/kernels/3.10.0-862.11.6.el7.x86_64'
INSTALL /home/centos/aws-fpga/sdk/linux_kernel_drivers/xdma/xdma.ko
Can't read private key
DEPMOD 3.10.0-862.11.6.el7.x86_64
make[2]: Leaving directory `/usr/src/kernels/3.10.0-862.11.6.el7.x86_64'
depmod -a
install -m 644 10-xdma.rules /etc/udev/rules.d
rmmod -s xdma || true
modprobe xdma
make[1]: Leaving directory `/home/centos/aws-fpga/sdk/linux_kernel_drivers/xdma'
install -d /opt/Xilinx/SDx/2017.1.rte/runtime/platforms/xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0/driver
install -d /opt/Xilinx/SDx/2017.1.rte/runtime/bin
install -d /opt/Xilinx/SDx/2017.1.rte/runtime/lib/x86_64
install -m 755 /home/centos/aws-fpga/SDAccel/userspace/src/libawsxcldrv.so /opt/Xilinx/SDx/2017.1.rte/runtime/platforms/xilinx_aws-vu9p-f1_4ddr-xpr-2pr_4_0/driver
install -m 755 /home/centos/aws-fpga/SDAccel/tools/awssak/xbsak /opt/Xilinx/SDx/2017.1.rte/runtime/bin
install -m 755 /opt/Xilinx/SDx/2017.4.op/runtime/bin/xclbincat /opt/Xilinx/SDx/2017.1.rte/runtime/bin
install -m 755 /opt/Xilinx/SDx/2017.4.op/runtime/bin/xclbinsplit /opt/Xilinx/SDx/2017.1.rte/runtime/bin
install -m 755 /opt/Xilinx/SDx/2017.4.op/lib/lnx64.o/libxilinxopencl.so /opt/Xilinx/SDx/2017.1.rte/runtime/lib/x86_64
install -m 755 /opt/Xilinx/SDx/2017.4.op/lib/lnx64.o/libstdc++.so* /opt/Xilinx/SDx/2017.1.rte/runtime/lib/x86_64
install: cannot stat ‘/opt/Xilinx/SDx/2017.4.op/lib/lnx64.o/libstdc++.so*’: No such file or directory
make: *** [install] Error 1.

Regards
Adarsh

Yolo script output freezes

I am currently running the Yolo demo on the FPGA AMI.
I have followed the steps mentioned in the ml-suite setup for the FPGA AMI.

git clone https://github.com/aws/aws-fpga.git
cd aws-fpga
source sdaccel_setup.sh

and then I run the ./run.sh script. But the script freezes at the following point, and I need to give a keyboard interrupt to stop it. Below is the last screen; it only runs for a single image.

Loading weights/bias/quant_params to FPGA...
Finished batch 2
INFO: Running Image(s):
INFO: ['/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/13923040300_b4c8521b4d_z.jpg', '/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/14931486720_37bd588ce9_z.jpg', '/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/15439525724_97d7cc2c81_z.jpg', '/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/16247716843_b419e8b111_z.jpg']
INFO: Preparing Input...
INFO: Running 4 image(s)
INFO:
Total FPGA: 655.149937 ms
INFO: Image Time: (163.787484 ms/img):
INFO: Running Image(s):
Finished batch 3
INFO: ['/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/3272651417_27976a64b3_z.jpg', '/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/3591612840_33710806df_z.jpg', '/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/36085792773_b9a3d115a3_z.jpg', '/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/4788821373_441cd29c9f_z.jpg']
INFO: Preparing Input...
INFO: Results for image 0: /home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/13923040300_b4c8521b4d_z.jpg
INFO: Running 4 image(s)
INFO:
Total FPGA: 290.031910 ms
INFO: Image Time: (72.507977 ms/img):
INFO: Running Image(s):
INFO: ['/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/4814953542_de4b973dc2_z.jpg', '/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/5904386289_924b24d75d_z.jpg', '/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/7291910830_86a8ebb15d_z.jpg', '/home/centos/ml-suite/xfdnn/tools/quantize/calibration_directory/7647574936_ffebfa2bea_z.jpg']
INFO: Preparing Input...
INFO: Running 4 image(s)
INFO: Found 2 boxes
INFO: Obj 0: skateboard
INFO: score = 0.622609
INFO: (xlo,ylo) = (277,381)
INFO: (xhi,yhi) = (509,311)
INFO: Obj 1: person
INFO: score = 0.865611
INFO: (xlo,ylo) = (149,340)
INFO: (xhi,yhi) = (504,28)
DEBUG: STREAM 'IHDR' 16 13
DEBUG: STREAM 'IDAT' 41 1216
oimage = 13923040300_b4c8521b4d_z.jpg
Saving new image with bounding boxes drawn as out/13923040300_b4c8521b4d_z.jpg
X11 connection rejected because of wrong authentication.
QXcbConnection: Could not connect to display localhost:10.0
INFO:
Total FPGA: 290.164948 ms
INFO: Image Time: (72.541237 ms/img):
.

import CaffeFrontend failed in xyolo.py

(ml-suite) ahe@5810:~/ml-suite/apps/yolo$ grep CaffeFrontend * -R
xyolo.py:from xfdnn.tools.compile.frontends.frontend_caffe import CaffeFrontend as xfdnnCompiler
xyolo.py:from xfdnn.tools.quantize.frontends.frontend_caffe import CaffeFrontend as xfdnnQuantizer
Binary file xyolo.pyc matches
yolo.py:from xfdnn.tools.compile.bin.xfdnn_compiler_caffe import CaffeFrontend as xfdnnCompiler
yolo.py:from xfdnn.tools.quantize.quantize import CaffeFrontend as xfdnnQuantizer

Failed to parse NetParameter file in YOLOv2 Demo App

Following the instructions [1] for the YOLOv2 demo app on AWS, the run bash script is failing:
./run.sh aws e2e

I'm seeing the failure below. If I use the yolov2.caffemodel file found in the YOLOv2 zip file from the download page [2], I get past this failure. It looks like the code is expecting a binary caffemodel file instead of the format found here [3].

I0719 14:52:40.138916 4568 net.cpp:255] Network initialization done.
F0719 14:52:40.139163 4568 upgrade_proto.cpp:95] Check failed: ReadProtoFromBinaryFile(param_file, param) Failed to parse NetParameter file: /home/centos/ml-suite/models/yolov2/caffe/fp32/yolov2.caffemodel
*** Check failure stack trace: ***
./run.sh: line 56: 4568 Aborted python yolo.py

[1] https://github.com/Xilinx/ml-suite/tree/master/apps/yolo
[2] https://github.com/Xilinx/ml-suite/blob/master/docs/tutorials/models.md
[3] https://github.com/Xilinx/ml-suite/blob/8f90af4a337a049fae5074c27c7e5e60ad363fc9/models/yolov2/caffe/fp32/yolov2.caffemodel
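A quick way to check whether a given .caffemodel actually parses as the binary NetParameter that upgrade_proto.cpp expects. This is only a hedged diagnostic sketch, assuming pycaffe is importable; the file path is a placeholder.

# Hedged sketch: try to parse a .caffemodel as a binary NetParameter protobuf.
from caffe.proto import caffe_pb2

path = 'yolov2.caffemodel'  # placeholder path to the model under test
param = caffe_pb2.NetParameter()
with open(path, 'rb') as f:
    data = f.read()
try:
    param.ParseFromString(data)
    print('parsed OK: %d layer entries' % max(len(param.layer), len(param.layers)))
except Exception as e:
    print('not a binary NetParameter: %s' % e)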

C'est fini!

The last line of "notebooks/image_classification_tensorflow.ipynb"

fpgaRT.execute hangs

Hi,

My platform is AWS, and I was going through the Jupyter notebook tutorial image_classification_tensorflow.ipynb.

I realized that the execution hangs at the following step: Step 9. Write optimized micro-code to the xDNN Processing Engine on the FPGA.

I assume it's not supposed to take minutes to run inference (rather, it should take a few milliseconds, I'd guess).

The following is the log from the terminal session in which I opened Jupyter:

xclProbe found 1 FPGA slots with xocl driver running
CL_PLATFORM_VENDOR Xilinx
CL_PLATFORM_NAME Xilinx
CL_DEVICE_0: 0x980df20
CL_DEVICES_FOUND 1, using 0
loading ../overlaybins/aws/overlay_3.xclbin
This AFI already loaded. Skip reload!
Successfully skipped reloading of local image.
[XBLAS] kernel0: kernelSxdnn_0
[XDNN] loading xclbin settings from ../overlaybins/aws/overlay_3.xclbin.json
[XDNN] using custom DDR banks 2

[XDNN] kernel configuration
[XDNN] num cores : 1
[XDNN] dsp array width : 56
[XDNN] img mem size : 5 MB
[XDNN] version : 2.3
[XDNN] 8-bit mode : 0
[XDNN] Max Image W/H : 1023
[XDNN] Max Image Depth : 4095
[XDNN] Max Filter Depth : 0

Is there anybody who has clues on this? Thanks!

J

Failed to find an OpenCL platform YOLOv2 Demo on AWS

I'm getting the following error running the yolo demo app on AWS with "./run.sh aws e2e". The compile and quantization steps appear to complete successfully; xyolo is failing.

  • NOTE: Because of #19, I am using the yolo_deploy_608.prototxt and yolov2.caffemodel found in the zip file from the model download page, not the versions found in ml-suite/models/yolov2/caffe/fp32/.

INFO: Entering XYOLO WITH
Finished batch 1
[XBLAS] # kernels: 1
[0]user:0xf010:0x1d51:[???:??:0]
xclProbe found 1 FPGA slots with xocl driver running
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD0
WARNING: AwsXcl isGood: invalid user handle.
WARNING: xclOpen Handle check failed
[0]user:0xf010:0x1d51:[???:??:0]
device[0].user_instance : 0
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD0
WARNING: AwsXcl isGood: invalid user handle.
ERROR: xclOpen Handle check failed
ERROR: Failed to find an OpenCL platform

Unable to open FPGA handle through the Jupyter notebooks

I was trying to run the new developer notebook which was added recently from here:
https://github.com/Xilinx/ml-suite/blob/master/notebooks/Xilinx-ML-Developer-Lab/ml-suite-developer-lab.ipynb.
I launched the notebook using the new start_ami.sh file (this was really helpful, kudos to you all). But in the last step I was not able to create the FPGA handle. I know that we need sudo permission to access the FPGA, so I tried to change the script to mask off the sudo users, but then Jupyter failed to load notebooks. I notice the same error when I try to launch the Jupyter notebook from the ml-suite environment.
Please provide your thoughts where I might be going wrong.
@wilderfield

Running on FPGA Developer AMI

Xilinx is still working on rolling out an updated AWS Marketplace AMI for ml-suite.

The new AMI will help developers begin quickly evaluating ml-suite without having to install anaconda, and prepare other dependencies.

Until then, you will need to evaluate with the FPGA Developer AMI. Please consider the below notes.

TEMPORARY NOTE:
If you are evaluating on AWS, the binaries we have included (overlaybins/aws/) support the latest Amazon shell,
DSA name: xilinx_aws-vu9p-f1-04261818_dynamic_5_0.
The Xilinx ml-suite AMI was bundled for an older shell.
For this reason, if you are starting your evaluation today, it is best to begin from the FPGA Developer AMI.
If you are using the AWS EC2 F1 FPGA Developer AMI, the following steps are necessary to set up the drivers:

git clone https://github.com/aws/aws-fpga.git
cd aws-fpga
source sdaccel_setup.sh
Remember that AWS requires users to run as root to control the FPGA, so the following is necessary to use Anaconda as root:

Become root: sudo su
Set environment variables required by the runtime: source <MLSUITE_ROOT>/overlaybins/setup.sh aws
Set user environment variables required to run Anaconda: source ~centos/.bashrc
Activate the user's Anaconda virtual environment: source activate ml-suite
You can avoid disk space problems on the FPGA DEVELOPER AMI by creating an instance with more than the default 70G of storage, or by resizing the /swapfile to something less than 35G.

YOLOv2 demo on AWS -- failed to find an OpenCL platform error

I’m trying to run the YOLOv2 demo on an AWS F1 instance (FPGA DEV AMI version 1.4.0), but get the following error when running ./run.sh aws e2e.

INFO: Entering XYOLO WITH
Finished batch 1
[XBLAS] # kernels: 1
[0]user:0xf010:0x1d51:[???:??:0]
xclProbe found 1 FPGA slots with xocl driver running
Linux:3.10.0-862.9.1.el7.x86_64:#1 SMP Mon Jul 16 16:29:36 UTC 2018:x86_64
Distribution: CentOS Linux release 7.5.1804 (Core)
GLIBC: 2.17
XILINX_OPENCL="/home/centos/ml-suite/overlaybins/aws"
LD_LIBRARY_PATH="/home/centos/ml-suite/overlaybins/aws/runtime/lib/x86_64/:/home/centos/ml-suite/xfdnn/rt/xdnn_cpp/build/lib:/home/centos/ml-suite/xfdnn/rt/lib:/home/centos/ml-suite/ext/boost/lib:/home/centos/ml-suite/ext/zmq/libs:/home/centos/ml-suite/apps/yolo"
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD0
WARNING: AwsXcl isGood: invalid user handle.
WARNING: xclOpen Handle check failed
[0]user:0xf010:0x1d51:[???:??:0]
device[0].user_instance : 0
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD0
WARNING: AwsXcl isGood: invalid user handle.
ERROR: xclOpen Handle check failed
ERROR: Failed to find an OpenCL platform

I see that this issue is similar to #20. The solution mentioned therein is to ensure that sdaccel_setup.sh is sourced successfully. In my case the following are the last few output lines when sdaccel_setup.sh is sourced:

echo "XOCL_DIR: /home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl"
XOCL_DIR: /home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl
make -C /lib/modules/3.10.0-862.9.1.el7.x86_64/build M=/home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl modules
make[2]: Entering directory `/usr/src/kernels/3.10.0-862.9.1.el7.x86_64'
  Building modules, stage 2.
  MODPOST 1 modules
make[2]: Leaving directory `/usr/src/kernels/3.10.0-862.9.1.el7.x86_64'
make[1]: Leaving directory `/home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl'
INFO: Installing SDAccel runtime
SDK_DIR = /home/centos/aws-fpga/sdk
SDACCEL_DIR = /home/centos/aws-fpga/SDAccel
XILINX_SDX = /opt/Xilinx/SDx/2017.4.op
INSTALL_ROOT=/opt/Xilinx/SDx/2017.4.rte.dyn
DSA=xilinx_aws-vu9p-f1_dynamic_5_0
make -C /home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl install
make[1]: Entering directory `/home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl'
echo "include: -I/home/centos/aws-fpga/SDAccel/userspace/include -I/home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl -I/home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl/../xdma/"
include: -I/home/centos/aws-fpga/SDAccel/userspace/include -I/home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl -I/home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl/../xdma/
echo "sdaccel_dir: /home/centos/aws-fpga/SDAccel"
sdaccel_dir: /home/centos/aws-fpga/SDAccel
echo "ROOT: "
ROOT:
echo "XOCL_DIR: /home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl"
XOCL_DIR: /home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl
make -C /lib/modules/3.10.0-862.9.1.el7.x86_64/build M=/home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl modules
make[2]: Entering directory `/usr/src/kernels/3.10.0-862.9.1.el7.x86_64'
  Building modules, stage 2.
  MODPOST 1 modules
make[2]: Leaving directory `/usr/src/kernels/3.10.0-862.9.1.el7.x86_64'
make -C /lib/modules/3.10.0-862.9.1.el7.x86_64/build M=/home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl modules_install
make[2]: Entering directory `/usr/src/kernels/3.10.0-862.9.1.el7.x86_64'
  INSTALL /home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl/xocl.ko
Can't read private key
  DEPMOD 3.10.0-862.9.1.el7.x86_64
make[2]: Leaving directory `/usr/src/kernels/3.10.0-862.9.1.el7.x86_64'
depmod -a
install -m 644 10-xocl.rules /etc/udev/rules.d
rmmod -s xocl || true
rmmod -s xdma || true
rmmod -s edma_drv || true
modprobe xocl
make[1]: Leaving directory `/home/centos/aws-fpga/sdk/linux_kernel_drivers/xocl'
install -d /opt/Xilinx/SDx/2017.4.rte.dyn/runtime/platforms/xilinx_aws-vu9p-f1_dynamic_5_0/driver
install -d /opt/Xilinx/SDx/2017.4.rte.dyn/runtime/bin
install -d /opt/Xilinx/SDx/2017.4.rte.dyn/runtime/lib/x86_64
install -m 755 /home/centos/aws-fpga/SDAccel/userspace/src2/libxrt-aws.so /opt/Xilinx/SDx/2017.4.rte.dyn/runtime/platforms/xilinx_aws-vu9p-f1_dynamic_5_0/driver
install -m 755 /home/centos/aws-fpga/SDAccel/tools/awssak2/xbsak /opt/Xilinx/SDx/2017.4.rte.dyn/runtime/bin
install -m 755 /opt/Xilinx/SDx/2017.4.op/runtime/bin/xclbincat /opt/Xilinx/SDx/2017.4.rte.dyn/runtime/bin
install -m 755 /opt/Xilinx/SDx/2017.4.op/runtime/bin/xclbinsplit /opt/Xilinx/SDx/2017.4.rte.dyn/runtime/bin
install -m 755 /home/centos/aws-fpga/SDAccel/aws_platform/xilinx_aws-vu9p-f1_dynamic_5_0/sw/lib/x86_64/libxilinxopencl.so /opt/Xilinx/SDx/2017.4.rte.dyn/runtime/lib/x86_64
install -m 755 /opt/Xilinx/SDx/2017.4.op/lib/lnx64.o/Default/libstdc++.so* /opt/Xilinx/SDx/2017.4.rte.dyn/runtime/lib/x86_64
Generating SDAccel F1 runtime environment setup script, /opt/Xilinx/SDx/2017.4.rte.dyn/setup.sh for bash
Generating SDAccel F1 runtime environment setup script, /opt/Xilinx/SDx/2017.4.rte.dyn/setup.csh for (t)csh
XILINX_OPENCL=/opt/Xilinx/SDx/2017.4.rte.dyn
INFO: SDAccel runtime installed
INFO: SDAccel Setup PASSED

Despite this, I keep getting the error mentioned above. I also see that the sdaccel_setup.sh script sets XILINX_OPENCL to /opt/Xilinx/SDx/2017.4.rte.dyn; however, the ml-suite README.md recommends running the script ml-suite/overlaybins/setup.sh, which sets the same variable back to /home/centos/ml-suite/overlaybins/aws. Could this mismatch be leading to the error?

Any pointers on how to resolve this error would be appreciated.
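A small diagnostic sketch (my own suggestion, not from this thread) that prints the runtime-related environment variables and lists the /dev/dri render nodes the AwsXcl warnings refer to; it uses only the Python standard library.

# Hedged diagnostic sketch: dump the environment and render nodes before running.
import glob
import os

for var in ('XILINX_OPENCL', 'LD_LIBRARY_PATH', 'AWS_PLATFORM'):
    print('%s=%s' % (var, os.environ.get(var, '<unset>')))

# "Cannot open userPF: /dev/dri/renderD0" suggests no usable render node;
# list whatever nodes the xocl driver actually created.
nodes = glob.glob('/dev/dri/renderD*')
print('render nodes: %s' % (nodes if nodes else 'none found'))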

Thanks,
Shiril

Classify multiple images

I have a server looping over input to classify multiple images. The server will run OK for about 10-15 images and then crash:

python: xmlrt.cpp:246: int XComputeUnit::computeImgDdrOffset(cl_mem): Assertion `imageMemPhys - _imgDdrBase < std::numeric_limits::max()' failed

The server calls createHandle on the FPGA, loads weights, and calls prepareOutput only once (as suggested in #23); then in the loop, for each image, it calls prepareInput and then execute (a sketch of this sequence is included below).

Is this a C++ runtime bug, or is there anything wrong with my sequence of operations?
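For reference, a minimal sketch of the sequence described above, using only the xdnn/xdnn_io calls that appear elsewhere in these issues; the image list and the reuse of a single output buffer are assumptions on my part.

# Hedged sketch of the described server loop: one-time handle/weights/output
# setup, then per-image prepareInput + execute.
import sys
import xdnn, xdnn_io

args = xdnn_io.processCommandLine()
if xdnn.createHandle(args['xclbin'], "kernelSxdnn_0", args['xlnxlib']) != 0:
    sys.exit(1)

(weightsBlob, fcWeight, fcBias) = xdnn_io.loadWeights(args)   # once
fpgaOutput = xdnn_io.prepareOutput(args['fpgaoutsz'], 1)      # once, batch of 1

for image in ['img0.jpg', 'img1.jpg']:       # hypothetical input stream
    args['images'] = [image]
    (fpgaInputs, batch_sz) = xdnn_io.prepareInput(args)       # per image
    xdnn.execute(args['netcfg'], weightsBlob, fpgaInputs, fpgaOutput,
                 batch_sz, args['quantizecfg'], args['scaleB'], args['PE'])

xdnn.closeHandle()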

Thank you,

Quantization for networks defined in other frameworks besides Caffe

I wonder if it is possible to support quantization for other frameworks,
and also to support FPGA execution (i.e. pyxfdnn.execute() in your example) without providing a quantization cfg, so we can try a model without quantization, or even quantize it during training.
Thank you!

Can't clone ml-suite (repository is over its data quota)

Attempting to clone the repo to my AWS instance fails with the following:

[centos@ip-172-31-91-124 ~]$ git clone https://github.com/Xilinx/ml-suite.git
Cloning into 'ml-suite'...
remote: Counting objects: 36628, done.
remote: Compressing objects: 100% (322/322), done.
remote: Total 36628 (delta 144), reused 263 (delta 78), pack-reused 36210
Receiving objects: 100% (36628/36628), 743.40 MiB | 39.01 MiB/s, done.
Resolving deltas: 100% (13127/13127), done.
Downloading models/caffe/aiotlabs/fp32/cifar_normalized.npz (255 MB)
Error downloading object: models/caffe/aiotlabs/fp32/cifar_normalized.npz (6b97b94): Smudge error: Error downloading models/caffe/aiotlabs/fp32/cifar_normalized.npz (6b97b94a15c87f0b9143c3c567ca208a54ebdb3adc8cc0a9e66c392332fbe3b4): batch response: This repository is over its data quota. Purchase more data packs to restore access.

Errors logged to /home/centos/ml-suite/.git/lfs/logs/20180809T132055.292163089.log
Use `git lfs logs last` to view the log.
error: external filter git-lfs smudge -- %f failed 2
error: external filter git-lfs smudge -- %f failed
fatal: models/caffe/aiotlabs/fp32/cifar_normalized.npz: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry the checkout with 'git checkout -f HEAD'

I get something similar if I try git lfs pull afterward:

[root@ip-172-31-91-124 ml-suite]# git lfs pull
batch response: This repository is over its data quota. Purchase more data packs to restore access.                                                                                                                     
error: failed to fetch some objects from 'https://github.com/Xilinx/ml-suite.git/info/lfs'

How can this be deployed on Alibaba Cloud?

Hello, I am a user from **, using a server on ** Alibaba Cloud. I saw on your official website that it can be deployed on Alibaba Cloud. How do I deploy it?

image_classification_tensorflow example - Error deploying FPGA model (xdnn.createHandle)

Hi, I am able to compile and quantize the Python TensorFlow model, but step 5, deployment, fails. I am using an AWS Ubuntu F1 instance. Is it possible to run ml-suite without the FPGA Developer AMI?

import xfdnn.rt.xdnn as xdnn
ret, handles = xdnn.createHandle('../overlaybins/aws/overlay_3.xclbin')
[XBLAS] # kernels: 1
[0]user:0x1042:0x7:[???:??:0]
xclProbe found 1 FPGA slots with xocl driver running
ERROR: Load image failed.
ERROR: Sleep until load failed.
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD0
ERROR AwsXcl: PCI kernel bar attach failed for slot# 0
WARNING: AwsXcl isGood: kernel, global & mgmt bar are: ffffffff, ffffffff, ffffffff
WARNING: xclOpen Handle check failed
[0]user:0x1042:0x7:[???:??:0]
device[0].user_instance : 0
ERROR: Load image failed.
ERROR: Sleep until load failed.
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD0
ERROR AwsXcl: PCI kernel bar attach failed for slot# 0
WARNING: AwsXcl isGood: kernel, global & mgmt bar are: ffffffff, ffffffff, ffffffff
ERROR: xclOpen Handle check failed
ERROR: Failed to find an OpenCL platform

Handling the unsupported layer

Hello,
I am trying to compile Tiny Darknet using ML Suite. I could generate the Caffe models using the darknet2caffe script. But when I tried to compile the network using the compiler_caffe I got the following message:

  • GENERATING OUTPUT FILES

XDNN Command file: work/tinydarknet/tinyDarknet.cmds
XDNN JSON Report file: work/tinydarknet/tinyDarknet.cmds.json
OUTPUT REPORT:
Unsupported Layers: 1
0) data
Attributes: ("# LAYER data [u'Input'] ['layer']", u'data: type=Input, sizes=None, shapes=None, sched 0 Kernel None Strides None Padding None NO VALID CODE ')
Compiling weights from: ../tinydarknet/tiny.caffemodel
Writing weights to directory work/tinydarknet/tiny.caffemodel_data
SUCCES.
It looks like it cannot support the first layer. But as I have seen in the other examples, the first (Input) layer is always unsupported and execution starts with the conv0 layer.
Can I just ignore this error?

FPGA Developer AMI 1.5 (Please stay on 1.4)

FPGA Developer AMI 1.5 was tested with ml-suite for the first time today.

I am seeing some issues with the Xilinx runtime that comes preinstalled in the AMI.

Please hold off on developing with this AMI, and use version 1.4.

loading AFI error

I use the Xilinx ML Suite AMI, and I am trying to run the same demo app with the same command "./run.sh aws e2e" as in this issue. I also source sdaccel_setup.sh after the recommended modification. However, it gives me the following error:

[XBLAS] # kernels: 1
[0]user:0xf010:0x1d51:[xocl:2017.4.5:128]
xclProbe found 1 FPGA slots with xocl driver running
CL_PLATFORM_VENDOR Xilinx
CL_PLATFORM_NAME Xilinx
CL_DEVICE_0: 0x3169fe0
CL_DEVICES_FOUND 1, using 0
loading /home/centos/src/project_data/ml-suite/overlaybins/aws/overlay_3.xclbin
AFI not yet loaded, proceed to download.
ERROR: Failed to create compute program from binary -44

I also tried the Jupyter tutorial provided by ml-suite; it gives me the same error:

loading /home/centos/src/project_data/ml-suite/overlaybins/aws/overlay_3.xclbin
AFI not yet loaded, proceed to download.
ERROR: Failed to create compute program from binary -44

Could someone help me out with this?
Sincere thanks!
UPDATE: I am using the master branch.

offline application

Hi, thanks for your great contribution. I have a question:
is it possible to implement deep learning on Xilinx without AWS, i.e., run all DL-related applications with Caffe (or MXNet) only on the FPGA board?
Sorry that I am not familiar with FPGAs, but I have a strong interest in (and good experience with) deep learning; I am a C/C++ programmer.
Thank you.

Running Yolo / ML Suite with AWS F1

I spent several weeks getting YOLO / ML Suite to run on AWS F1, and finally I can see the results. I did not try to change the design; I just wanted to run it and see the result. The reason it took so long is that I was a beginner with AWS and was trying to use SDAccel (github.com/Xilinx/SDAccel_Examples) on AWS F1 at the same time, since ML Suite relies on SDAccel. Even though I already had experience with SDAccel on a local machine, I hit a number of errors related to the AWS environment.
Anyway, I prepared a getting-started guide, yolo_on_aws_f1.pdf, and hope it will help those who run into trouble while running YOLO / ML Suite.

Thanks to all who gave help on the issues I raised.

error: running the quantizer for tinyDarknet

Hello,
I was trying to run the quantizer for Tiny Darknet. I have the following two questions.
I used the weight file generated after running the compiler (generated as tinyDarkoptimized). When I run the quantizer I get the following error:
Processing layer 7 of 37
Layer Name:layer5-conv Type:Convolution
Inputs: ['layer4-maxpool'], Outputs: ['layer5-conv']
Quantizing conv input layer ... layer5-conv
Threshold in shape= ()
Quantizing conv weights for layer layer5-conv...
Threshold params shape= (16,)
Min: 0 , Max: nan
Failed to quantize: range parameter must be finite.

1) I am not sure about the error, and I can't examine it since I do not have a .py file (a small diagnostic sketch is included after these questions).

2) Do I need to use the weight file generated through the quantizer, or can I directly use the .caffemodel I generated through the darknet2caffe script (the weight file is already small, around 4 MB) and run inference without quantizing?
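"Max: nan" suggests that the weights already contain NaN values before quantization. Below is a hedged diagnostic sketch (my own, not part of ml-suite) that scans every layer's weights for NaN/Inf with pycaffe; the prototxt and caffemodel paths are placeholders.

# Hedged sketch: check the Caffe weights for NaN/Inf values per layer.
import caffe
import numpy as np

# Placeholder paths for the converted Tiny Darknet model under test.
net = caffe.Net('tiny_deploy.prototxt', 'tiny.caffemodel', caffe.TEST)
for name, blobs in net.params.items():
    for idx, blob in enumerate(blobs):
        if not np.isfinite(blob.data).all():
            print('%s[%d] contains NaN/Inf (min=%s, max=%s)'
                  % (name, idx, np.nanmin(blob.data), np.nanmax(blob.data)))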

failure for the darknet2caffe.py script for tinyDarknet

Hi, I wanted to run the Tiny Darknet application using ML Suite. As the initial step I downloaded the weights and cfg files from https://pjreddie.com/darknet/tiny-darknet/. When I ran the script as
python darknet2caffe.py -d "tinydark/tiny.cfg" -w "tinydark/tiny.weights" -p"prottiny" -c"cafemodeltiny"
I got the following error:
Traceback (most recent call last):
File "darknet2caffe.py", line 516, in
darknet2caffe(cfgfile, weightfile, protofile, caffemodel, arch, mergeBN)
File "darknet2caffe.py", line 84, in darknet2caffe
start = load_conv2caffe(buf, start, params[conv_layer_name])
File "darknet2caffe.py", line 120, in load_conv2caffe
conv_param[0].data[...] = np.reshape(buf[start:start+weight.size], weight.shape); start = start + weight.size
File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 257, in reshape
return _wrapfunc(a, 'reshape', newshape, order=order)
File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 52, in _wrapfunc
return getattr(obj, method)(*args, **kwds)
Has anyone tried to compile the prototxt and .caffemodel files for Tiny Darknet?

Gemx to ml-suite

Thanks again for this great work on FPGAs!
I have some questions about ml-suite, though.
I see gemx in this repo, so I guess the xclbins are built from gemx?
But the spec in your gemx repo says that the gemx engine only supports uint16, int16, short and unsigned short as its t_FloatType, so how do you achieve 8-bit (char) computation?
Or is the gemx in this repo different from your gemx repo?
Thank you!

Failed to allocate A device memory

We're trying to run the YOLO Object detection tutorial using our own YOLO model. With our custom caffemodel and prototxt we are able to get through compilation and quantization. But then we hit the following error trying to load the weights to the FPGA. Any ideas?

Loading weights/bias/quant_params to FPGA...
ERROR: Failed to allocate A device memory
python: xblas.cpp:3202: void xblasLoadA(XBLASHandle&, int, const void*, XBLASConfig*, int): Assertion `ret' failed.
