Comments (8)
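Try using taskset to pin the benchmark program to the big CPU cores. For small models, CPU scheduling and driver overhead cannot be hidden behind GPU kernel execution time, so the impact is fairly noticeable.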
from bolt.
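@yunfanxiao I am also on a Xiaomi 11 and tested Qualcomm's latest official adreno_opencl_ml_sdk_v2.1.zip. On the Xiaomi 11, calling clQueryMLInterfaceVersionsQCOM returns a CL_OUT_OF_HOST_MEMORY error, yet Qualcomm states that 660 and above support the cl_qcom_ml_ops extension. How did bolt measure the performance of adreno_opencl_ml_sdk?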
from bolt.
bolt does not currently integrate qcom's ml sdk. You can take this question to qcom directly.
from bolt.
Ah, I see. I saw someone from bolt mention on Zhihu that qcom's adreno_opencl_ml_sdk reaches a measured 1.5T, so I wanted to ask how to solve this.
from bolt.
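Although the qcom adreno clml sdk interface is integrated into libOpenCL.so, the vendor libraries it depends on are not accessible to third-party apps by default. Calls from third-party developers return -6, and you will find the corresponding selinux audit entries in logcat. If you have a rooted 888 or 8Gen1 device, you can test with setenforce 0.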
from bolt.
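For readers matching the -6 in the reply above to the CL_OUT_OF_HOST_MEMORY reported earlier in the thread, here is a minimal Python sketch that maps a few standard OpenCL status codes to their names and checks whether an extension appears in a device's extension string. The helper names `describe_cl_status` and `advertises_extension` are hypothetical, and the extension string is assumed to have been obtained elsewhere (for example from CL_DEVICE_EXTENSIONS). As the thread shows, an advertised extension does not guarantee that the call succeeds.

```python
# Standard OpenCL status codes (a small subset, values from the OpenCL headers).
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def describe_cl_status(code):
    """Return the symbolic name for a status code, or a fallback string."""
    return CL_ERROR_NAMES.get(code, "unknown status %d" % code)


def advertises_extension(extensions, name):
    """Check a space-separated extension string for an exact extension name."""
    return name in extensions.split()


if __name__ == "__main__":
    print(describe_cl_status(-6))
    # Hypothetical extension string, for illustration only.
    sample = "cl_khr_fp16 cl_qcom_ml_ops cl_khr_subgroups"
    print(advertises_extension(sample, "cl_qcom_ml_ops"))
```

Here the -6 is what the reply attributes to the selinux restriction on the vendor libraries, so the status name alone should not be read as a genuine host memory shortage.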
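@chillingche Thanks, I will have to find a rooted device to try. So the permission problem is that the phone manufacturer has not opened access, even though Qualcomm has?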
from bolt.
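@yunfanxiao I tried CPU binding on a Xiaomi 11 (888) with taskset -a 70 ./benchmark -a GPU -w 10 -l 10 -m ResNet-50_f16.bolt. The average latency dropped from 30ms to 27ms, but that is still some way off the 25ms in the article. Is there anything else I should pay attention to?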
from bolt.
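The 70 in the taskset command above is a hexadecimal bitmask, where bit N selects CPU N. A minimal Python sketch of how such a mask maps to core indices follows. The helper names `affinity_mask` and `cores_in_mask` are hypothetical, and which core indices are the big cores is device specific (it is only an assumption that cores 4 to 6 are performance cores on this phone). Comparing /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq on the device is one way to check.

```python
def affinity_mask(cores):
    """Return the hex affinity mask (no 0x prefix) for the given core indices."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core  # set the bit that corresponds to this core
    return format(mask, "x")


def cores_in_mask(hex_mask):
    """Return the sorted core indices selected by a hex affinity mask."""
    value = int(hex_mask, 16)
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


if __name__ == "__main__":
    print(affinity_mask([4, 5, 6]))  # 70
    print(cores_in_mask("70"))       # [4, 5, 6]
```

Under the same assumption, a mask of 80 would select only core 7, and f0 would select cores 4 to 7.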
@peyer I ran into the same problem. Did you root your phone? Has this been solved?
from bolt.