huawei-noah / bolt
Bolt is a deep learning library with high performance and heterogeneous flexibility.
Home Page: https://huawei-noah.github.io/bolt/
License: MIT License
Has Bolt been verified to work on a Raspberry Pi, for example a Raspberry Pi 4B (armv7)?
./test_deconvolution_ocl 24 256 128 1 2 2 2 0 <
[DEBUG] thread 13883 OCLContext 0x61531c6278 constructor start
[DEBUG] thread 13883 try to dlopen libQUALCOMM_Adreno_660_map.so failed, dlopen failed: library "libQUALCOMM_Adreno_660_map.so" not found, create kernel from source code
[DEBUG] thread 13883 gcl_kernel_source 0xb40000714c3a1250 constructor
[DEBUG] thread 13883 OCLContext 0x61531c6278 constructor end
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_12 runInfo: ls <0 0 0> executeTime = 153.856000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_22 runInfo: ls <0 0 0> executeTime = 130.816000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_32 runInfo: ls <0 0 0> executeTime = 153.088000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_42 runInfo: ls <0 0 0> executeTime = 122.880000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_14 runInfo: ls <0 0 0> executeTime = 143.872000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_24 runInfo: ls <0 0 0> executeTime = 102.144000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_34 runInfo: ls <0 0 0> executeTime = 118.016000 us
[DEBUG] thread 13883 enqueue_fill_image runInfo: executeTime = 15.872000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_trans_fltbuf_44 runInfo: executeTime = 5.888000 us
[DEBUG] thread 13883 DATATRANS>>> enqueue_write_buffer runInfo: executeTime = 129.024000 us
[DEBUG] thread 13883 KERNEL>>> unknow_mem_trans_om_nchw_to_nchwc4 runInfo: executeTime = 113.920000 us
[INFO] thread 13883 warm up gpu:
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_24 runInfo: ls <0 0 0> executeTime = 102.912000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_24 runInfo: ls <0 0 0> executeTime = 100.864000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_24 runInfo: ls <0 0 0> executeTime = 98.048000 us
[DEBUG] thread 13883 KERNEL>>> unknow_mem_trans_im_nchwc4_to_nchw runInfo: executeTime = 51.968000 us
[DEBUG] thread 13883 DATATRANS>>> enqueue_read_buffer runInfo: executeTime = 16.896000 us
[INFO] thread 13883 16bit, Deonvolution, (1 24 256 128)+(24 1 2 2)/(2 0)=(1 1 512 256), TIME 0.098ms, GFLOPS 65.504
abs(diff) >= 1.000000e+00f, number = 23
abs(diff) >= 1.000000e-01f, number = 822
abs(diff) >= 1.000000e-02f, number = 164
abs(diff) >= 1.000000e-03f, number = 1084
abs(diff) >= 1.000000e-04f, number = 85300
abs(diff) >= 1.000000e-05f, number = 3176
abs(diff) >= 0.000000e+00f, number = 40503
maxabs = 1.530273, a = 0.000000, b = 1.530273 @ 428
maxrel = 976.562500, a = -0.000244, b = 0.000244 @ 73386
[DEBUG] thread 13883 OCLContext 0x61531c6278 deconstructor start
[DEBUG] thread 13883 gcl_kernel_source 0xb40000714c3a1250 constructor
[DEBUG] thread 13883 OCLContext 0x61531c6278 deconstructor end
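The seven diff counts in the log above sum to 131072, which equals the 1 x 1 x 512 x 256 output size, so each element appears to be counted once, under the largest threshold it reaches. A minimal Python sketch of that bucketing follows. The name diff_histogram and the threshold list are illustrative assumptions read off the log, not Bolt's test code.

```python
# Thresholds as printed in the log, from largest to smallest (assumed order).
THRESHOLDS = [1.0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 0.0]


def diff_histogram(a, b, thresholds=THRESHOLDS):
    """Count each element once, under the first (largest) threshold its
    absolute difference reaches. Hypothetical helper for reading the log."""
    counts = [0] * len(thresholds)
    for x, y in zip(a, b):
        d = abs(x - y)
        for i, t in enumerate(thresholds):
            if d >= t:
                counts[i] += 1
                break
    return counts


# Counts copied from the log; they add up to the number of output elements.
reported = [23, 822, 164, 1084, 85300, 3176, 40503]
total_elements = sum(reported)
```

Read this way, 23 output values differ from the reference by 1.0 or more, which is worth checking before trusting the reported GFLOPS.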
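I found a bug while testing Bolt's OpenCL backend.
Bolt mixes data layouts such as NCHW and NCHWC4, mixes buffer, image1d, image2d, and image3d for the blobs passed between layers under OpenCL, and also reuses memory during allocation. This combination may be what triggered a bug in the depth2space_ocl layer of one of my models: the kernel for depth2space declares its input argument as a buffer, but after memory reuse an image3d object is passed for that argument, so set_arg fails with CL_INVALID_MEM_OBJ. I am still tracking down the memory-reuse code.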
Many excellent open source projects provide a demo as a quick start. However, we can't find a real example that can be run directly after compilation.
An example with input data and real models would be friendlier to beginners. Thx :)
While in release history you refer to it as 0.2.0.
I am confused.
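Model file names often contain characters such as '.', and X2bolt does not escape these characters when converting a model.
However, when build_preprocess_ocl.sh imports the GPU kernel bin into cpp for compilation, these special characters cause compilation of the algorithm file to fail.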
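One possible workaround is to sanitize the name before it is used as a symbol in generated C++ code. The sketch below assumes the offending characters end up in a C++ identifier derived from the model name. to_c_identifier is a hypothetical helper, not part of X2bolt or build_preprocess_ocl.sh.

```python
import re


def to_c_identifier(name: str) -> str:
    """Map an arbitrary model name to a valid C/C++ identifier.

    Hypothetical helper: every character outside [0-9A-Za-z_] becomes '_',
    and a leading digit (or an empty name) gets a '_' prefix.
    """
    ident = re.sub(r"[^0-9A-Za-z_]", "_", name)
    if not ident or ident[0].isdigit():
        ident = "_" + ident
    return ident
```

Distinct names such as "a.b" and "a_b" map to the same identifier under this scheme, so a real fix would also need a way to keep symbols unique.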
https://github.com/TensorSpeech/TensorFlowTTS/blob/v1.1/tensorflow_tts/models/mb_melgan.py I used this code to train a model and convert it to bolt:
[ERROR] thread 37973 file /home/disk1/soft/bolt-master/model_tools/src/onnx/onnx_adaptee.h line 2128 can not process operator name:Pad__82 type:Pad attributes.
Does Pad not support dynamic input?
axes is no longer an attribute but an input.
$ ./install.sh --target=android-aarch64 --gpu
[ERROR] please install llvm-ranlib tools and set shell environment PATH to find it
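The llvm-ranlib tool used by the new version's build process does not exist in NDK r20; the tool there is llvm-ar. Copying llvm-ar and renaming the copy works as a workaround. I suggest updating the install.sh script.
I benchmarked Bolt with the profile feature turned off at the install stage, looking only at the total model time, and found that it does not reach the performance reported in the article. I am not sure what I did wrong; please help take a look.
As shown in the figure.
The article reports 3.949 ms for squeezenet1.1 on Qualcomm 888 in half precision; on a Xiaomi 11 (Qualcomm 888) I measured avg_time:7.443091ms/data for the fp16 case.
To verify, I tested the resnet50 network mentioned in https://github.com/huawei-noah/bolt/blob/master/docs/USER_HANDBOOK.md. After converting with the X2bolt tool, my command was ./benchmark -a GPU -w 10 -l 10 -m ResNet-50_f16.bolt
The fp16 timing on Qualcomm 888 is: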
Benchmark Result:
Output Tensor prob desc: dt:DT_F16 memFormat:DF_NCHW stride(1000,1,1) offset(0,0,0) data: 0.000166 0.000330 0.000063 0.000110 0.000000 0.000508 0.000000 0.000000 sum: 0.992770
total_time:305.839355ms(loops=10)
avg_time:30.583936ms/data
min_time:29.903076ms/data
max_time:31.020020ms/data
Is an average time of 30.58 ms normal here? Could you share the resnet50 timing, or provide the resnet50_v2 model file (the official article reports about 25 ms), so that I can cross-check?
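For readers comparing such numbers, the sketch below shows how total, avg, min, and max figures of this kind are typically produced. It assumes that -w is a warm-up count and -l a timed loop count, which matches the "loops=10" in the output but is not confirmed by the source. The benchmark function is a hypothetical stand-in, not Bolt's benchmark tool.

```python
import time


def benchmark(run_once, warmup=10, loops=10):
    """Run `run_once` untimed `warmup` times, then time `loops` runs.

    Returns times in milliseconds. Hypothetical harness for illustration.
    """
    for _ in range(warmup):
        run_once()  # warm-up runs are executed but not measured
    times_ms = []
    for _ in range(loops):
        start = time.perf_counter()
        run_once()
        times_ms.append((time.perf_counter() - start) * 1000.0)
    total = sum(times_ms)
    return {
        "total_time": total,
        "avg_time": total / loops,
        "min_time": min(times_ms),
        "max_time": max(times_ms),
    }


# The reported average is the reported total divided by the loop count.
reported_avg = 305.839355 / 10
```

With warm-up excluded, differences between two setups more likely come from the model file, precision, or device state (clock frequency, thermal throttling) than from first-run kernel compilation.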
Hi, Thanks for this great work.
Is there any performance comparison with oneDNN and CoreML?
build log of device BC-src-code:286:9: error: use of undeclared identifier 'in_off'
in_off += ihw_str;
^
1 diagnostic(s) generated.
Hello, compiling with your instructions (using cross compile) I face this problem. How can I solve it?
[ 74%] Linking CXX executable ../../../image/bin/test_image_processing
/home/<user>/Downloads/gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu/bin/../lib/gcc/aarch64-linux-gnu/8.3.0/../../../../aarch64-linux-gnu/bin/ld: ../../../image/dependency/png/lib/libpng.a(png.o): Relocations in generic ELF (EM: 62)
/home/<user>/Downloads/gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu/bin/../lib/gcc/aarch64-linux-gnu/8.3.0/../../../../aarch64-linux-gnu/bin/ld: ../../../image/dependency/png/lib/libpng.a(png.o): Relocations in generic ELF (EM: 62)
/home/<user>/Downloads/gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu/bin/../lib/gcc/aarch64-linux-gnu/8.3.0/../../../../aarch64-linux-gnu/bin/ld: ../../../image/dependency/png/lib/libpng.a(png.o): Relocations in generic ELF (EM: 62)
/home/<user>/Downloads/gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu/bin/../lib/gcc/aarch64-linux-gnu/8.3.0/../../../../aarch64-linux-gnu/bin/ld: ../../../image/dependency/png/lib/libpng.a: error adding symbols: file in wrong format
Is the Bolt discussion group not open to new members?
Is there a model visualization tool at the moment?
Please add an easy-to-use benchmark tool that can run arbitrary models, so users can see the performance of popular models such as MobileNet.
Bolt supports both XNOR-style and DoReFa-style BNN networks. Just save the binary weights as FP32 in an Onnx model, and X2bolt will automatically convert the storage to 1-bit representations. So far, the floating-point portion of the BNN network can only be FP16 operations, so pass "FP16" as the precision parameter to X2bolt. The number of output channels for BNN convolution layers should be divisible by 32.
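What does FP16 mean here? Does it mean that support for binarized networks is actually implemented with FP16? And why must the number of output channels be divisible by 32?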
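On the 1-bit storage mentioned above, here is a minimal sketch of sign bit-packing, assuming weights are packed one bit each into 32-bit words. Under that assumption, which the source does not confirm, a count divisible by 32 fills every word completely. pack_signs is a hypothetical helper, not X2bolt's implementation.

```python
def pack_signs(weights, word_bits=32):
    """Pack +1/-1 weights into integers, one bit per weight.

    Bit i of a word is 1 when the i-th weight of that group is positive
    and 0 otherwise. Hypothetical illustration of 1-bit storage.
    """
    if len(weights) % word_bits != 0:
        raise ValueError("number of weights must be a multiple of word_bits")
    words = []
    for start in range(0, len(weights), word_bits):
        word = 0
        for bit, w in enumerate(weights[start:start + word_bits]):
            if w > 0:
                word |= 1 << bit
        words.append(word)
    return words
```

In this sketch 64 FP32 weights (256 bytes) shrink to two 32-bit words (8 bytes), and a group of 31 weights is rejected because it would leave a word partly filled.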
Hello,
how can I fix these CANNOT LINK EXECUTABLE errors? The tests run on neither Kirin 980 nor Kirin 990.
1: --- Network Test (LeNet)
1: CANNOT LINK EXECUTABLE "/data/local/tmp/uldra/lenet": cannot locate symbol "Mali_G76p_bin" referenced by "/data/local/tmp/uldra/libkernelbin.so"...
1: CANNOT LINK EXECUTABLE "/data/local/tmp/uldra/lenet": cannot locate symbol "Mali_G76p_bin" referenced by "/data/local/tmp/uldra/libkernelbin.so"...
1: [ 20%] /data/local/tmp/uldra/hdr_ocl
1: [ 40%] /data/local/tmp/uldra/hdr_ocl
1: [ 60%] /data/local/tmp/uldra/hdr_ocl
1: [ 80%] /data/local/tmp/uldra/hdr_ocl
1: [100%] /data/local/tmp/uldra/hdr_ocl
1: /home/yury/source/bolt-master/tests/bin/hdr_ocl: 1 file pushed. 5.3 MB/s (324480 bytes in 0.059s)
1:
1:
1: --- GPU Network Test (HDR_OCL)
1:
1: === Input FP16
1: CANNOT LINK EXECUTABLE "/data/local/tmp/uldra/hdr_ocl": cannot locate symbol "Mali_G76p_bin" referenced by "/data/local/tmp/uldra/libkernelbin.so"...
1:
1: === Input UCHAR
1: CANNOT LINK EXECUTABLE "/data/local/tmp/uldra/hdr_ocl": cannot locate symbol "Mali_G76p_bin" referenced by "/data/local/tmp/uldra/libkernelbin.so"...
1/1 Test #1: quick_benchmark .................. Passed 7.88 sec
Is there power-consumption comparison data for FP16, INT8, and Binary with the same network structure?
model_tools/tools/tensorflow2caffe/tts/transform_fastspeech2.py Is the model source code open source?
Is there any documentation for Raspberry Pi 4?
I get the error below when running cmake ..
error: ‘Factory’ was not declared in this scope
std::shared_ptr<Factory> factory;
We have successfully built the Bolt inference library without the model converter on a Raspberry Pi 3 Model B (armv7).
export CFLAGS="-march=armv7-a -mfpu=neon-vfpv4 "
export CXXFLAGS="-march=armv7-a -mfpu=neon-vfpv4 "
./install.sh --target=linux-armv7_blank --converter=off -t 4
#benchmark
./install_linux-armv7_blank/example/benchmark -m ./kit/assets/ImageClassification/ghostnet_f32.bolt
You can transfer your Bolt model to the Raspberry Pi and run inference.
The shell script for installing third-party libraries mentioned in INSTALL.md is not in the repo, so I have to install and configure all the dependencies manually.
According to the README and the code directory layout, there are two ways to convert PyTorch models:
https://pytorch.org/docs/stable/jit.html?highlight=script
https://github.com/pytorch/pytorch/tree/master/torch/csrc/jit
Hello,
Could you please explain a couple of things that are unclear to me? I use ONNX models and would like to use the onnx2bolt runtime. I've deployed a MaskRCNN network in ONNX, and the script fails with Segfault.
Are ROIAlign and NMS available to be converted from ONNX? Which ONNX opset is supported?
ONNX2Bolt runtime?
What is Bolt's positioning inside Huawei, and is there any chance that Hi chips will be supported in the future?
./install.sh --target=android-aarch64 --mali -t 6
-- CXXFLAGS: --target=aarch64-linux-android21 -W -Wall -Wextra -O3 -fPIC -fstack-protector-all -Wno-unused-command-line-argument -Wno-unused-parameter -Wno-unused-result -Wno-deprecated-declarations -Wno-unused-variable -pthread -D_USE_JNI -D_USE_ANDROID_LOG -llog -D_USE_GENERAL -D_USE_MALI -D_USE_FP32 -D_USE_NEON -D_USE_FP16 -D_USE_F16_MIX_PRECISION -D_USE_INT8 -march=armv8-a+fp16+dotprod -D_USE_CAFFE -D_USE_ONNX -D_USE_TFLITE -D_USE_TENSORFLOW -std=c++11 -W -Wall -Wextra -O3 -fPIC -fstack-protector-all -Wno-unused-command-line-argument -Wno-unused-parameter -Wno-unused-result -Wno-deprecated-declarations -Wno-unused-variable -pthread -D_USE_JNI -D_USE_ANDROID_LOG -llog -D_USE_GENERAL -D_USE_MALI -D_USE_FP32 -D_USE_NEON -D_USE_FP16 -D_USE_F16_MIX_PRECISION -D_USE_INT8 -march=armv8-a+fp16+dotprod -D_USE_CAFFE -D_USE_ONNX -D_USE_TFLITE -D_USE_TENSORFLOW -Wl,-allow-shlib-undefined -static-libstdc++
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
Protobuf_SHARED_LIBRARY
linked by target "model_tools_caffe" in directory /home/ubuntu/workspace/bolt/model_tools/src/caffe
linked by target "model_tools_onnx" in directory /home/ubuntu/workspace/bolt/model_tools/src/onnx
-- Configuring incomplete, errors occurred!
CMake Error at C:/Program Files/CMake/share/cmake-3.22/Modules/CMakeTestCCompiler.cmake:69 (message):
The C compiler
"D:/mingw64/bin/gcc.exe"
is not able to compile a simple test program.
It fails with the following output:
Change Dir: C:/Users/Q/Desktop/bolt-master/third_party/windows-x86_64/protobuf/protobuf-3.14.0/build/CMakeFiles/CMakeTmp
Run Build Command(s):D:/mingw64/bin/mingw32-make.exe -f Makefile cmTC_4fc87/fast && mingw32-make -f CMakeFiles\cmTC_4fc87.dir\build.make CMakeFiles/cmTC_4fc87.dir/build
mingw32-make[1]: Entering directory 'C:/Users/Q/Desktop/bolt-master/third_party/windows-x86_64/protobuf/protobuf-3.14.0/build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_4fc87.dir/testCCompiler.c.obj
D:\mingw64\bin\gcc.exe -o CMakeFiles\cmTC_4fc87.dir\testCCompiler.c.obj -c C:\Users\Q\Desktop\bolt-master\third_party\windows-x86_64\protobuf\protobuf-3.14.0\build\CMakeFiles\CMakeTmp\testCCompiler.c
Linking C executable cmTC_4fc87.exe
"C:\Program Files\CMake\bin\cmake.exe" -E cmake_link_script CMakeFiles\cmTC_4fc87.dir\link.txt --verbose=1
"C:\Program Files\CMake\bin\cmake.exe" -E rm -f CMakeFiles\cmTC_4fc87.dir/objects.a
D:\mingw64\bin\ar.exe qc CMakeFiles\cmTC_4fc87.dir/objects.a @CMakeFiles\cmTC_4fc87.dir\objects1.rsp
D:\mingw64\bin\gcc.exe -Wl,--whole-archive CMakeFiles\cmTC_4fc87.dir/objects.a -Wl,--no-whole-archive -o cmTC_4fc87.exe -Wl,--out-implib,libcmTC_4fc87.dll.a -Wl,--major-image-version,0,--minor-image-version,0 @CMakeFiles\cmTC_4fc87.dir\linklibs.rsp
gcc.exe: error: CreateProcess: No such file or directory
mingw32-make[1]: *** [CMakeFiles\cmTC_4fc87.dir\build.make:100: cmTC_4fc87.exe] Error 1
mingw32-make[1]: Leaving directory 'C:/Users/Q/Desktop/bolt-master/third_party/windows-x86_64/protobuf/protobuf-3.14.0/build/CMakeFiles/CMakeTmp'
mingw32-make.exe: *** [Makefile:126: cmTC_4fc87/fast] Error 2
Building with the command
./install.sh -t 12 -c llvm
I get this error:
- [new tag] v2.1.0-rc1 -> v2.1.0-rc1
- [new tag] v2.1.0-rc2 -> v2.1.0-rc2
- [new tag] v2.2.0-rc0 -> v2.2.0-rc0
- [new tag] v2.2.0-rc1 -> v2.2.0-rc1
From https://github.com/tensorflow/tensorflow
- branch master -> FETCH_HEAD
error: Sparse checkout leaves no entry on working directory
On a Qualcomm 855 phone, using the libOpenCL.so generated by the Bolt build triggers a SIGTRAP in clGetPlatformIDs.
Hi, everyone.
Could you please help me resolve an issue?
I've built Bolt as described in INSTALL.md
with a Kirin 980 device plugged in.
At the end of the installation I saw:
1: Test command: /root/bolt/quick_benchmark.sh "-b" "/root/bolt/tests/bin" "-p" "/data/local/tmp/uldra" "-l" "/root/bolt/install_llvm/lib"
1: Test timeout computed to be: 10000000
1: [INFO] run test in '/root/bolt/tests/bin'
1: [INFO] test on device directory `/data/local/tmp/uldra'
1: [INFO] use library in /root/bolt/install_llvm/lib
1: /root/bolt/install_llvm/lib/libBoltModel.so: 1 file pushed. 2.5 MB/s (1067120 bytes in 0.413s)
1: /root/bolt/install_llvm/lib/libblas-enhance.so: 1 file pushed. 1.8 MB/s (57456 bytes in 0.031s)
1: /root/bolt/install_llvm/lib/libimage.so: 1 file pushed. 2.5 MB/s (149856 bytes in 0.058s)
1: /root/bolt/install_llvm/lib/libinference.so: 1 file pushed. 2.6 MB/s (682352 bytes in 0.253s)
1: /root/bolt/install_llvm/lib/libmodel-tools.so: 1 file pushed. 2.5 MB/s (246248 bytes in 0.093s)
1: /root/bolt/install_llvm/lib/libmodel-tools_caffe.so: 1 file pushed. 1.9 MB/s (1439040 bytes in 0.710s)
1: /root/bolt/install_llvm/lib/libmodel-tools_onnx.so: 1 file pushed. 2.7 MB/s (486920 bytes in 0.169s)
1: /root/bolt/install_llvm/lib/libmodel-tools_tflite.so: 1 file pushed. 3.2 MB/s (279320 bytes in 0.083s)
1: /root/bolt/install_llvm/lib/libtensor_computing.so: 1 file pushed. 2.3 MB/s (709328 bytes in 0.291s)
1: /root/bolt/tests/bin/test_mmm_int8: 1 file pushed. 2.7 MB/s (131904 bytes in 0.047s)
1: /root/bolt/tests/bin/test_mmm: 1 file pushed. 2.0 MB/s (136400 bytes in 0.066s)
1:
1: --- Matrix Matrix Multiplication
1: taskset: failed to set 25058's affinity: Invalid argument
1: taskset: failed to set 25061's affinity: Invalid argument
1: /root/bolt/tests/bin/test_convolution: 1 file pushed. 1.5 MB/s (30144 bytes in 0.019s)
1:
1: --- Conv IC=3
1: taskset: failed to set 25065's affinity: Invalid argument
1: /root/bolt/tests/bin/test_convolution_bnn: 1 file pushed. 1.5 MB/s (30152 bytes in 0.019s)
1: /root/bolt/tests/bin/test_convolution_int8: 1 file pushed. 1.7 MB/s (30232 bytes in 0.017s)
1:
1: --- Conv 5x5
1: taskset: failed to set 25070's affinity: Invalid argument
1: taskset: failed to set 25073's affinity: Invalid argument
1: taskset: failed to set 25076's affinity: Invalid argument
1:
1: --- Conv 3x3
1: taskset: failed to set 25079's affinity: Invalid argument
1: taskset: failed to set 25082's affinity: Invalid argument
1: taskset: failed to set 25085's affinity: Invalid argument
1: /root/bolt/tests/bin/test_depthwise_convolution: 1 file pushed. 1.4 MB/s (30264 bytes in 0.021s)
1:
1: --- Depthwise-Pointwise Conv
1: taskset: failed to set 25089's affinity: Invalid argument
1: /root/bolt/tests/bin/lenet: 1 file pushed. 2.1 MB/s (414384 bytes in 0.185s)
1:
1:
1: --- Network Test (LeNet)
1: taskset: failed to set 25093's affinity: Invalid argument
1: taskset: failed to set 25096's affinity: Invalid argument
1/1 Test #1: quick_benchmark .................. Passed 5.42 sec
100% tests passed, 0 tests failed out of 1
But when I try to run the onnx2bolt binary I see an error:
CANNOT LINK EXECUTABLE "./tools/onnx2bolt": library "libprotobuf.so.11" not found
There was a different error before I ran export LD_LIBRARY_PATH=/data/local/tmp/uldra
For a fairly large transformer model (200M), conversion fails with Segmentation fault.
When I tried to convert an ONNX model of ReActNet to a Bolt model, this error occurred. How can I fix it?
Does Bolt support batch inference? Can I run inference on 2 or more sentences at the same time?
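The GPU algorithm file contains algorithmMap and kernelThreadMap. When a model contains only simple OPs (eltwise, power, etc.), no search over parameters such as tiling is needed, so algorithmMap is empty, while kernelThreadMap still holds the local search results for these OPs.
This gives a corner case: algorithmMap.size() == 0 && kernelThreadMap.size() > 0
In this case void saveMapToFile() has a bug: the local search results of such a model are not saved to the algorithm file. As a result, the next time the model is initialized, the local search has to be redone even though the algorithm file is linked, and the first run of the model is very slow. Concretely, the execution times with -w 0 and -w 1 differ markedly.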
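The corner case can be illustrated with a small Python sketch. The source does not show the body of saveMapToFile(), so the guarded variant below is only an assumption about the kind of check that would produce this behavior. Both function names are hypothetical.

```python
def entries_to_save_guarded(algorithm_map, kernel_thread_map):
    """Assumed buggy variant: skips saving whenever algorithm_map is empty,
    which also drops a non-empty kernel_thread_map."""
    if len(algorithm_map) == 0:
        return []
    return list(algorithm_map.items()) + list(kernel_thread_map.items())


def entries_to_save(algorithm_map, kernel_thread_map):
    """Variant that skips saving only when both maps are empty."""
    if len(algorithm_map) == 0 and len(kernel_thread_map) == 0:
        return []
    return list(algorithm_map.items()) + list(kernel_thread_map.items())


# Corner case from the report: no tiling search, but local sizes were searched.
# The key and tuple below are made-up example values.
corner_case = ({}, {"eltwise": (8, 8, 1)})
```

For the corner case, the guarded variant returns an empty list, while the second variant keeps the kernelThreadMap entry so the local search would not have to be repeated on the next initialization.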
MINGW64 /f/bolt-master
$ ./install.sh --target=android-aarch64
[INFO] use 8 threads to parallel build third party library on windows-x86_64 for target android-aarch64 in directory /f/bolt-master/third_party/android-aarch64...
[INFO] use c language compiler /c/Users/AppData/Local/Android/Sdk/ndk/20.1.5948944/toolchains/llvm/prebuilt/windows-x86_64/bin/clang
[INFO] use c++ language compiler /c/Users/AppData/Local/Android/Sdk/ndk/20.1.5948944/toolchains/llvm/prebuilt/windows-x86_64/bin/clang++
[INFO] generate environment file to /f/bolt-master/third_party/android-aarch64.sh...
[INFO] build TFLite in /f/bolt-master/third_party/android-aarch64/tflite...
[INFO] please source /f/bolt-master/third_party/android-aarch64.sh to use...
[INFO] use /f/bolt-master/third_party/android-aarch64.sh to set environment variable...
[ERROR] TFLite not install success
Following installation guide in INSTALL.md I face this problem:
CMake Error at CMakeLists.txt:25 (install): install TARGETS given target "test_completeUT" which does not exist.
How can I fix it?
Is converting RNNs such as BiLSTM to C++ supported?
Can Bolt be compiled on Windows?
We want to know what you would like to see added to Bolt. We will evaluate your suggestions and plan development accordingly, so please tell us your requirements.
ftmDataFormat for CONVOLUTION_ALGORITHM_DIRECT should be DF_NCHWN16.
https://github.com/huawei-noah/bolt/blob/master/tensor_computing/src/cpu/arm/fp16/convolution_transform_fp16.h#L194
[libprotobuf ERROR google/protobuf/text_format.cc:307] Error parsing text-format caffe.NetParameter: 80:14: Message type "caffe.EmbedParameter" has no field named "transpose".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0225 15:25:28.170408 100496 upgrade_proto.cpp:90] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: tensorflow2caffe/tts/tts_encoder.prototxt
*** Check failure stack trace: ***
Version 1.2.1 replaced the schema with the one from TensorFlow master, and opcode parsing broke. I added some logging at line 260 of tflite_adaptee, and every opcode printed is 0:
[INFO] thread 12984 Start to convert ./xxx.tflite...
[parse_file] tfliteModel->operator_codes[0]->builtin_code : 0
[parse_file] tfliteModel->operator_codes[1]->builtin_code : 0
[parse_file] tfliteModel->operator_codes[2]->builtin_code : 0
[parse_file] tfliteModel->operator_codes[3]->builtin_code : 0
[parse_file] tfliteModel->operator_codes[4]->builtin_code : 0
[parse_file] tfliteModel->operator_codes[5]->builtin_code : 0
Segmentation fault