Comments (8)
When exporting, we suggest no longer using the split model: change --export_split to --export.
If you must use a split model, you can add a config.json file under the model folder with the following content:
{
  "is_single": false,
  "backend_type": "cpu",
  "thread_num": 4,
  "precision": "low",
  "memory": "low"
}
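A minimal shell sketch of dropping that file into the model folder (the MODEL_DIR path is only an assumption; point it at your own Qwen-1_8B-Chat directory, and if a config.json already exists there, as with the HuggingFace checkpoint, merge the keys instead of overwriting):

MODEL_DIR=../../models/Qwen-1_8B-Chat   # assumed path; adjust to your layout
cat > "$MODEL_DIR/config.json" <<'EOF'
{
  "is_single": false,
  "backend_type": "cpu",
  "thread_num": 4,
  "precision": "low",
  "memory": "low"
}
EOF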
Found an issue: under the MNN directory there is no embeddings_bf16.bin or tokenizer.txt.
That is not a problem; those two files are now placed in the onnx folder, just copy them over.
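A sketch of that copy step (the onnx and mnn folder names here are assumptions; use your actual export output folders):

cp ./qwen18b-chat-onnx/embeddings_bf16.bin ./qwen18b-chat-onnx/tokenizer.txt ./qwen18b-chat-mnn/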
The Qwen-1_8B-Chat model folder already had a config.json; I merged your content above into it, ending up with the following:
{
  "is_single": false,
  "backend_type": "cpu",
  "thread_num": 4,
  "precision": "low",
  "memory": "low",
  "vocab_size": 151936
}
Export result: the following files were produced:
block_0.mnn block_13.mnn block_17.mnn block_20.mnn block_2.mnn block_6.mnn embeddings_bf16.bin
block_10.mnn block_14.mnn block_18.mnn block_21.mnn block_3.mnn block_7.mnn lm.mnn
block_11.mnn block_15.mnn block_19.mnn block_22.mnn block_4.mnn block_8.mnn tokenizer.txt
block_12.mnn block_16.mnn block_1.mnn block_23.mnn block_5.mnn block_9.mnn
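As a quick sanity check of a split export like this one (a sketch, run inside the output folder; Qwen-1_8B-Chat exports 24 blocks, numbered 0 through 23):

ls block_*.mnn | wc -l                       # expect 24
ls embeddings_bf16.bin lm.mnn tokenizer.txt  # the three non-block files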
Result: loading and running with the new mnn files, there is still no response at all.
Logs:
06-25 12:48:24.609 15355 17504 D MNN_DEBUG: Java_com_mnn_llm_Chat_Submit
06-25 12:48:24.609 15355 17504 I System.out: [MNN_DEBUG] start response
06-25 12:48:24.659 15355 17504 D MNN_DEBUG: Java_com_mnn_llm_Chat_Response
06-25 12:48:24.709 15355 17504 D MNN_DEBUG: Java_com_mnn_llm_Chat_Response
06-25 12:48:24.759 15355 17504 D MNN_DEBUG: Java_com_mnn_llm_Chat_Response
06-25 12:48:24.810 15355 17504 D MNN_DEBUG: Java_com_mnn_llm_Chat_Response
I tried the second approach, without the split model:

python llm_export.py \
    --path ../../modes/Qwen-1_8B-Chat \
    --type Qwen-1_8B-Chat \
    --export \
    --export_token \
    --export_mnn \
    --mnn_path ./qwen18b-chat-mnn \
    --onnx_path ./qwen18b-chat-onnx \
    --embed_bin \
    --embed_bf16
Result: 1) the qwen18b-chat-mnn folder contains only 2 files:
llm.mnn llm.mnn.weight
2) the qwen18b-chat-onnx folder contains only 4 files:
llm_config.json llm.onnx llm.onnx.data tokenizer.txt
The embeddings_bf16.bin file is missing, right?
The llm_export.py tool I am using is the one from https://github.com/wangzhaode/llm-export, correct? It seems the MNN repository also has an llm_export.py tool; which one should I actually use?
You didn't add --export_embed.
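For reference, the earlier command with the missing flag added would look like this (same paths and flags as the run above, plus --export_embed):

python llm_export.py \
    --path ../../modes/Qwen-1_8B-Chat \
    --type Qwen-1_8B-Chat \
    --export \
    --export_token \
    --export_embed \
    --export_mnn \
    --mnn_path ./qwen18b-chat-mnn \
    --onnx_path ./qwen18b-chat-onnx \
    --embed_bin \
    --embed_bf16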
We discussed this on DingTalk; the suggestion is to use the llm_export.py under the MNN repository, and not to use the split export.
After adding the --export_embed flag, the 4 files are indeed generated:
embeddings_bf16.bin
llm.mnn
llm.mnn.weight
tokenizer.txt
The problem is that the llm.mnn.weight files generated by the https://github.com/wangzhaode/llm-export repository and by the MNN repository differ in size.
File sizes from llm-export:
-rw-rw-r-- 1 cmdc2023 cmdc2023 622329856 Jun 26 18:02 embeddings_bf16.bin
-rw-rw-r-- 1 cmdc2023 cmdc2023 3251184 Jun 26 18:00 llm.mnn
-rw-rw-r-- 1 cmdc2023 cmdc2023 768407146 Jun 26 18:00 llm.mnn.weight
-rw-rw-r-- 1 cmdc2023 cmdc2023 1616272 Jun 26 18:02 tokenizer.txt
File sizes from MNN:
-rw-r--r-- 1 root root 622329856 Jun 26 15:46 embeddings_bf16.bin
-rw-rw-r-- 1 cmdc2023 cmdc2023 3251184 Jun 26 15:37 llm.mnn
-rw-rw-r-- 1 cmdc2023 cmdc2023 765759594 Jun 26 15:37 llm.mnn.weight
-rw-r--r-- 1 root root 1616272 Jun 26 15:46 tokenizer.txt
The question remains: I still do not know which one is correct.
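One way to narrow it down (a sketch using standard tools; llm-export-out and mnn-out are placeholder names for the two export folders): first check whether the graph files are byte-identical and only the weight blobs differ, then find where the weight files diverge:

md5sum llm-export-out/llm.mnn mnn-out/llm.mnn                  # same graph file?
cmp llm-export-out/llm.mnn.weight mnn-out/llm.mnn.weight       # offset of first differing byte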
After loading these four files into the installed apk, it crashes at runtime:
22194 22308 F libc : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 in tid 22308 (Thread-7),
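A fault address of 0x0 is a null-pointer dereference, which often points at a model file that was not found or failed to load rather than a compute bug. It may be worth confirming the four files are really at the path the app reads from (the /data/local/tmp path below is only an assumption; use whatever directory the app is configured with):

adb shell ls -l /data/local/tmp/qwen18b-chat-mnn   # assumed push location; adjust
adb logcat -s MNN_DEBUG                            # watch the app's own load logs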