yvonwin / qwen2.cpp
qwen2 and llama3 cpp implementation
License: Other
The GPU version fails with the following error:
ggml_aligned_malloc: insufficient memory (attempted to allocate 876546.56 MB)
GGML_ASSERT: /tmp/pip-req-build-5898jsey/third_party/ggml/src/ggml.c:2327: ctx->mem_buffer != NULL
Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
Aborted (core dumped)
What is causing the excessive memory allocation?
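The "Could not attach to process" lines in the log above appear to come from a debugger attach attempt being blocked by the kernel's Yama ptrace restriction, which is separate from the allocation failure itself. A minimal sketch for inspecting that setting follows; the helper names `read_ptrace_scope` and `describe_ptrace_scope` are hypothetical, and the value meanings follow the Linux kernel's Yama documentation.

```python
from pathlib import Path

# Meanings of the Yama ptrace_scope values (Linux kernel Yama documentation).
PTRACE_SCOPE_MEANING = {
    0: "classic: a process may attach to any process running under the same uid",
    1: "restricted: only a parent or a declared ptracer may attach",
    2: "admin-only: only processes with CAP_SYS_PTRACE may attach",
    3: "no attach: ptrace attach is disabled entirely",
}


def describe_ptrace_scope(value: int) -> str:
    """Return a human-readable description of a ptrace_scope value."""
    return PTRACE_SCOPE_MEANING.get(value, f"unknown ptrace_scope value: {value}")


def read_ptrace_scope(path: str = "/proc/sys/kernel/yama/ptrace_scope"):
    """Read the current setting, or return None if the file does not exist."""
    p = Path(path)
    if not p.exists():
        return None
    return int(p.read_text().strip())


if __name__ == "__main__":
    scope = read_ptrace_scope()
    if scope is None:
        print("Yama ptrace_scope is not available on this system")
    else:
        print(scope, describe_ptrace_scope(scope))
```

Relaxing the setting only lets the debugger print a stack trace after the assertion fires; it would not change the size of the attempted allocation.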
error: use of undeclared identifier 'ggml_metal_free'
auto operator()(ggml_metal_context *ctx) const noexcept -> void { ggml_metal_free(ctx); }
error: unknown type name 'ggml_metal_context'; did you mean 'ggml_opt_context'?
using unique_ggml_metal_context_t = std::unique_ptr<ggml_metal_context, ggml_metal_context_deleter_t>;
error: use of undeclared identifier 'ggml_metal_init'; did you mean 'ggml_numa_init'?
return unique_ggml_metal_context_t(ggml_metal_init(n_cb));
^~~~~~~~~~~~~~~
ggml_numa_init
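Starting the model with MODEL=./qwen2_1.8b-ggml.bin python -m uvicorn qwen_cpp.openai_api:app --host 127.0.0.1 --port 8000 runs it on the CPU, whereas build/bin/main runs it on the GPU. How can the OpenAI-style API service be started so that the model runs only on the GPU?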
Can it be compiled on Windows?
The build reported the following errors:
aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mavx’
aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mavx2’
aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mfma’
aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mf16c’
aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-msse3’
How should this be resolved?
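The rejected options are all x86-specific SIMD flags, which an AArch64 compiler does not accept. The sketch below illustrates the general idea of choosing such flags from the target architecture; it is not the project's actual build logic, and the helper name `simd_flags_for` is hypothetical.

```python
import platform
from typing import List

# The x86-only flags taken from the error messages above.
X86_SIMD_FLAGS = ["-mavx", "-mavx2", "-mfma", "-mf16c", "-msse3"]


def simd_flags_for(machine: str) -> List[str]:
    """Return the x86 SIMD flags only when the target is x86-64.

    Any other architecture (for example aarch64 or arm64) gets no
    x86 flags, so the compiler is never handed options it rejects.
    """
    if machine.lower() in ("x86_64", "amd64"):
        return list(X86_SIMD_FLAGS)
    return []


if __name__ == "__main__":
    # platform.machine() reports the architecture of the running host.
    host = platform.machine()
    print(host, simd_flags_for(host))
```

In a real build the same decision would normally be made inside the build system, so this only shows the condition that needs to hold.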
GGML_ASSERT: /tmp/pip-req-build-obcizsli/third_party/ggml/src/ggml.c:2493: view_src == NULL || data_size + view_offs <= ggml_nbytes(view_src)
Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Inappropriate ioctl for device.
No stack.
The program is not being run.
With max_length set to 1024, inference fails with an error. Does this need to be set during conversion?
Model conversion requires cloning the original model,
but downloading it takes a long time.
Does the author have a converted model to share?
The bundled qwen.tiktoken produces an error:
$ ./build/bin/main -m Qwen2-7B-Instruct-q8.ggml -p 你想活出怎样的人生 -s "你是一个猫娘"
ggml_init_cublas: GGML_CUDA_FORCE_MMQ: no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
unknown token: 152063
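One way to narrow down an "unknown token" report is to check which IDs the vocabulary file actually defines. The sketch below assumes the bundled file uses the usual tiktoken BPE layout (one base64-encoded token and its integer rank per line), which is not confirmed here; the helper names are hypothetical.

```python
import base64


def parse_tiktoken_lines(lines):
    """Parse lines of '<base64 token> <rank>' into a {token_bytes: rank} dict."""
    ranks = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        token_b64, rank = line.split()
        ranks[base64.b64decode(token_b64)] = int(rank)
    return ranks


def covers_token_id(ranks, token_id):
    """Return True if token_id appears as a rank in the parsed vocabulary."""
    return token_id in set(ranks.values())


if __name__ == "__main__":
    # Tiny in-memory stand-in for a vocabulary file (assumed format).
    sample = ["aGVsbG8= 0", "d29ybGQ= 1"]
    ranks = parse_tiktoken_lines(sample)
    print(max(ranks.values()), covers_token_id(ranks, 152063))
```

Special tokens are often assigned IDs by the tokenizer code rather than listed in the file, so an ID missing from the file is a hint about a vocabulary mismatch, not proof of one.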