Comments (8)

wangzhaode commented on August 18, 2024

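When exporting, I recommend not using the split model; change --export_split to --export.

If you must use the split model, add a config.json file to the model folder with the following content: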

{
    "is_single": false,
    "backend_type": "cpu",
    "thread_num": 4,
    "precision": "low",
    "memory": "low"
}
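
The same file can be written from a script. This is a minimal sketch, assuming the "model folder" is the directory that holds the exported .mnn files; the helper name `write_split_config` is hypothetical, and only the keys and values shown above come from this comment.

```python
import json
from pathlib import Path

# Settings copied from the comment above; nothing else is assumed about the schema.
SPLIT_CONFIG = {
    "is_single": False,
    "backend_type": "cpu",
    "thread_num": 4,
    "precision": "low",
    "memory": "low",
}


def write_split_config(model_dir):
    """Write config.json into model_dir (hypothetical helper) and return its path."""
    path = Path(model_dir) / "config.json"
    path.write_text(json.dumps(SPLIT_CONFIG, indent=4) + "\n", encoding="utf-8")
    return path
```

Note that this overwrites any config.json already present in that directory, so back it up first if one exists.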

from mnn.

wangzhaode commented on August 18, 2024

I found one issue: the MNN directory has no embeddings_bf16.bin or tokenizer.txt.

That is not a problem. Those files are now placed in the onnx directory; just copy them over.
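
The copy step could look like the sketch below. The two file names come from this thread; the function name `copy_extra_files` and the directory arguments are placeholders, not part of any MNN tool.

```python
import shutil
from pathlib import Path

# File names taken from the thread; the directory arguments are placeholders.
EXTRA_FILES = ("embeddings_bf16.bin", "tokenizer.txt")


def copy_extra_files(onnx_dir, mnn_dir):
    """Copy the embedding and tokenizer files next to the .mnn files.

    Returns the names that were actually copied; files that are missing
    from onnx_dir are skipped rather than treated as an error.
    """
    src, dst = Path(onnx_dir), Path(mnn_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in EXTRA_FILES:
        if (src / name).is_file():
            shutil.copy2(src / name, dst / name)
            copied.append(name)
    return copied
```

If the returned list is shorter than `EXTRA_FILES`, the export did not produce every file and the export flags are worth rechecking.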

sunnyzhaohui commented on August 18, 2024

The Qwen-1_8B-Chat model directory already contains a config.json. I added the content you posted above to it, as shown in this diff:
index 45c0d16..5138eea
--- a/modeling_qwen.py
+++ b/modeling_qwen.py
@@ -320,7 +320,46 @@ class QWenAttention(nn.Module):
warnings.warn("Failed to import KV cache kernels.")
self.cache_kernels = None

+  "is_single": false,
+  "backend_type": "cpu",
+  "thread_num": 4,
+  "precision": "low",
+  "memory": "low",
   "vocab_size": 151936
-}
\ No newline at end of file
+}

Export result: the following files were generated:
block_0.mnn block_13.mnn block_17.mnn block_20.mnn block_2.mnn block_6.mnn embeddings_bf16.bin
block_10.mnn block_14.mnn block_18.mnn block_21.mnn block_3.mnn block_7.mnn lm.mnn
block_11.mnn block_15.mnn block_19.mnn block_22.mnn block_4.mnn block_8.mnn tokenizer.txt
block_12.mnn block_16.mnn block_1.mnn block_23.mnn block_5.mnn block_9.mnn

Result: loading and running with the new .mnn files still produces no response.
Log:
06-25 12:48:24.609 15355 17504 D MNN_DEBUG: Java_com_mnn_llm_Chat_Submit
Line 8325: 06-25 12:48:24.609 15355 17504 I System.out: [MNN_DEBUG] start response
Line 8336: 06-25 12:48:24.659 15355 17504 D MNN_DEBUG: Java_com_mnn_llm_Chat_Response
Line 8358: 06-25 12:48:24.709 15355 17504 D MNN_DEBUG: Java_com_mnn_llm_Chat_Response
Line 8359: 06-25 12:48:24.759 15355 17504 D MNN_DEBUG: Java_com_mnn_llm_Chat_Response
Line 8362: 06-25 12:48:24.810 15355 17504 D MNN_DEBUG: Java_com_mnn_llm_Chat_Response

sunnyzhaohui commented on August 18, 2024

I tried the second approach, without the split model:
python llm_export.py \
    --path ../../modes/Qwen-1_8B-Chat \
    --type Qwen-1_8B-Chat \
    --export \
    --export_token \
    --export_mnn \
    --mnn_path ./qwen18b-chat-mnn \
    --onnx_path ./qwen18b-chat-onnx \
    --embed_bin \
    --embed_bf16

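Result: 1) the qwen18b-chat-mnn folder contains only 2 files:
llm.mnn llm.mnn.weight
2) the qwen18b-chat-onnx folder contains only 4 files:
llm_config.json llm.onnx llm.onnx.data tokenizer.txt

The embeddings_bf16.bin file is missing, correct?
The llm_export.py tool I used is the one from https://github.com/wangzhaode/llm-export. Is that right? The MNN repository also seems to have an llm_export.py tool; which one should I use?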

wangzhaode commented on August 18, 2024

I tried the second approach, without the split model: python llm_export.py --path ../../modes/Qwen-1_8B-Chat --type Qwen-1_8B-Chat --export --export_token --export_mnn --mnn_path ./qwen18b-chat-mnn --onnx_path ./qwen18b-chat-onnx --embed_bin --embed_bf16

Result: 1) the qwen18b-chat-mnn folder contains only 2 files: llm.mnn llm.mnn.weight 2) the qwen18b-chat-onnx folder contains only 4 files: llm_config.json llm.onnx llm.onnx.data tokenizer.txt

The embeddings_bf16.bin file is missing, correct? The llm_export.py tool I used is the one from https://github.com/wangzhaode/llm-export. Is that right? The MNN repository also seems to have an llm_export.py tool; which one should I use?

The command is missing --export_embed.
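
For reference, here is a sketch that assembles the same command with --export_embed added. Every flag is copied from this thread; whether each one exists in a particular version of llm_export.py is not verified here, and the helper name `build_export_args` is hypothetical.

```python
def build_export_args(model_path, model_type, mnn_path, onnx_path):
    """Return the argument list for llm_export.py, with --export_embed included."""
    return [
        "python", "llm_export.py",
        "--path", model_path,
        "--type", model_type,
        "--export",
        "--export_token",
        "--export_embed",  # the flag the original command was missing
        "--export_mnn",
        "--mnn_path", mnn_path,
        "--onnx_path", onnx_path,
        "--embed_bin",
        "--embed_bf16",
    ]


# Paths and model type exactly as used in the command quoted above.
args = build_export_args(
    "../../modes/Qwen-1_8B-Chat",
    "Qwen-1_8B-Chat",
    "./qwen18b-chat-mnn",
    "./qwen18b-chat-onnx",
)
```

The list can be passed to `subprocess.run(args, check=True)` from the directory that contains llm_export.py.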

sunnyzhaohui commented on August 18, 2024

After discussing this on DingTalk, the advice was to use the llm_export.py in the MNN repository and to avoid the split (sharded) export.

sunnyzhaohui commented on August 18, 2024

After adding the --export_embed parameter, the export does generate 4 files:
embeddings_bf16.bin
llm.mnn
llm.mnn.weight
tokenizer.txt
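
A quick way to confirm that an export directory contains these four files before copying it to a device is sketched below; the function name `missing_files` is illustrative only.

```python
from pathlib import Path

# The four files listed above.
EXPECTED_FILES = ("embeddings_bf16.bin", "llm.mnn", "llm.mnn.weight", "tokenizer.txt")


def missing_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    d = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (d / name).is_file()]
```

An empty list means all four files are present; it says nothing about whether their contents are valid.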

The problem is that the llm.mnn.weight files generated by the https://github.com/wangzhaode/llm-export repository and by the MNN repository differ in size.
File sizes from llm-export:
-rw-rw-r-- 1 cmdc2023 cmdc2023 622329856 Jun 26 18:02 embeddings_bf16.bin
-rw-rw-r-- 1 cmdc2023 cmdc2023 3251184 Jun 26 18:00 llm.mnn
-rw-rw-r-- 1 cmdc2023 cmdc2023 768407146 Jun 26 18:00 llm.mnn.weight
-rw-rw-r-- 1 cmdc2023 cmdc2023 1616272 Jun 26 18:02 tokenizer.txt

File sizes from MNN:
-rw-r--r-- 1 root root 622329856 Jun 26 15:46 embeddings_bf16.bin
-rw-rw-r-- 1 cmdc2023 cmdc2023 3251184 Jun 26 15:37 llm.mnn
-rw-rw-r-- 1 cmdc2023 cmdc2023 765759594 Jun 26 15:37 llm.mnn.weight
-rw-r--r-- 1 root root 1616272 Jun 26 15:46 tokenizer.txt

So the question remains: which one is correct?
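
To make the difference concrete, the sketch below compares the byte sizes from the two listings. It only measures the gap and does not say which export is correct. The closing check assumes a hidden size of 2048 for Qwen-1_8B-Chat, which is not stated in this thread; the vocab_size of 151936 comes from the config.json quoted earlier.

```python
# Sizes in bytes, copied from the two listings above.
LLM_EXPORT_SIZES = {
    "embeddings_bf16.bin": 622329856,
    "llm.mnn": 3251184,
    "llm.mnn.weight": 768407146,
    "tokenizer.txt": 1616272,
}
MNN_SIZES = {
    "embeddings_bf16.bin": 622329856,
    "llm.mnn": 3251184,
    "llm.mnn.weight": 765759594,
    "tokenizer.txt": 1616272,
}


def size_differences(a, b):
    """Return {name: a[name] - b[name]} for files present in both whose sizes differ."""
    return {name: a[name] - b[name] for name in a if name in b and a[name] != b[name]}


diff = size_differences(LLM_EXPORT_SIZES, MNN_SIZES)
print(diff)  # {'llm.mnn.weight': 2647552}

# Sanity check on the embedding file: vocab_size (151936) times an ASSUMED
# hidden size of 2048, times 2 bytes per bf16 value.
assert 151936 * 2048 * 2 == LLM_EXPORT_SIZES["embeddings_bf16.bin"]
```

Only llm.mnn.weight differs, by 2,647,552 bytes; the other three files have identical sizes in both exports.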

sunnyzhaohui commented on August 18, 2024

After loading these four files into the installed APK, the app crashes at runtime:
22194 22308 F libc : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 in tid 22308 (Thread-7),
