Comments (11)
I had similar issues before. I think I fixed it with pip install numpy==1.26.4
from mlc-llm.
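The ABI mismatch behind that fix can be checked up front. A minimal sketch (the helper name is ours, not part of mlc-llm or NumPy) that tests whether a NumPy version string still provides the 1.x ABI that prebuilt wheels expect:

```python
# Sketch: decide whether a NumPy version predates the 2.0 ABI break that
# crashes extension modules compiled against NumPy 1.x.
# (Hypothetical helper, not part of mlc-llm or NumPy.)

def numpy_1x_abi(version: str) -> bool:
    """Return True when `version` is a NumPy 1.x release."""
    return int(version.split(".")[0]) < 2

print(numpy_1x_abi("1.26.4"))    # True: safe with wheels built against 1.x
print(numpy_1x_abi("2.0.0rc1"))  # False: the version named in the traceback below
```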
And I got another error using the command line:
And today I got another problem; upgrading numpy made no difference.
Thank you @XJY990705 for reporting.
The first error is because the prebuilt library at dist/prebuilt_libs/Llama-2-7b-chat-hf/Llama-2-7b-chat-hf-q4f16_1-cuda.so is outdated and does not contain the latest changes. We will update our documentation to remove the incorrect examples. In the meantime, could you try the following for constructing the ChatModule?
cm = ChatModule(model="HF://mlc-ai/Llama-2-7b-chat-hf-q4f16_1-MLC")
I suspect that after constructing the ChatModule as above, you might again encounter the issue you just reported (Unsupported layout: 0). I don't have an idea off the top of my head; I will discuss with others to see if we can address this, and report back.
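As a sketch of the suggested construction (assuming the mlc_llm Python package and a CUDA-capable setup are installed; the HF:// scheme tells mlc-llm to fetch the precompiled weights and library from the mlc-ai Hugging Face organization, and the import path follows the mlc-llm docs of this period):

```python
# Suggested ChatModule construction; the HF:// URL resolves to the
# mlc-ai/Llama-2-7b-chat-hf-q4f16_1-MLC repo on Hugging Face.
MODEL = "HF://mlc-ai/Llama-2-7b-chat-hf-q4f16_1-MLC"

try:
    from mlc_llm import ChatModule  # requires the mlc-llm pip package

    cm = ChatModule(model=MODEL)
    print(cm.generate("What is the capital of France?"))
except ImportError:
    # Environments without mlc_llm installed land here.
    print("mlc_llm is not installed; see https://llm.mlc.ai/docs/")
```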
@MasterJH5574 I got this problem, though:
And when I try to convert the weights according to the document https://llm.mlc.ai/docs/compilation/convert_weights.html, I get the problem below; upgrading numpy did not work:
(mlc-chat-venv) wangh@zj-MS-7B17:~/mlc-llm$ mlc_llm convert_weight ./dist/models/RedPajama-INCITE-Instruct-3B-v1/ \
    --quantization q4f16_1 \
    -o dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1-MLC
[2024-04-16 09:43:25] INFO auto_config.py:115: Found model configuration: dist/models/RedPajama-INCITE-Instruct-3B-v1/config.json
[2024-04-16 09:43:26] INFO auto_device.py:76: Found device: cuda:0
[2024-04-16 09:43:27] INFO auto_device.py:85: Not found device: rocm:0
[2024-04-16 09:43:28] INFO auto_device.py:85: Not found device: metal:0
[2024-04-16 09:43:29] INFO auto_device.py:76: Found device: vulkan:0
[2024-04-16 09:43:29] INFO auto_device.py:76: Found device: vulkan:1
[2024-04-16 09:43:30] INFO auto_device.py:85: Not found device: opencl:0
[2024-04-16 09:43:30] INFO auto_device.py:33: Using device: cuda:0
[2024-04-16 09:43:30] INFO auto_weight.py:70: Finding weights in: dist/models/RedPajama-INCITE-Instruct-3B-v1
[2024-04-16 09:43:30] INFO auto_weight.py:129: Found source weight format: huggingface-torch. Source configuration: dist/models/RedPajama-INCITE-Instruct-3B-v1/pytorch_model.bin
[2024-04-16 09:43:30] INFO auto_weight.py:167: Not found Huggingface Safetensor
[2024-04-16 09:43:30] INFO auto_weight.py:106: Using source weight configuration: dist/models/RedPajama-INCITE-Instruct-3B-v1/pytorch_model.bin. Use --source to override.
[2024-04-16 09:43:30] INFO auto_weight.py:110: Using source weight format: huggingface-torch. Use --source-format to override.
[2024-04-16 09:43:30] INFO auto_config.py:153: Found model type: gpt_neox. Use --model-type to override.
Weight conversion with arguments:
--config dist/models/RedPajama-INCITE-Instruct-3B-v1/config.json
--quantization GroupQuantize(name='q4f16_1', kind='group-quant', group_size=32, quantize_dtype='int4', storage_dtype='uint32', model_dtype='float16', linear_weight_layout='NK', quantize_embedding=True, quantize_final_fc=True, num_elem_per_storage=8, num_storage_per_group=4, max_int_value=7)
--model-type gpt_neox
--device cuda:0
--source dist/models/RedPajama-INCITE-Instruct-3B-v1/pytorch_model.bin
--source-format huggingface-torch
--output dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1-MLC
[2024-04-16 09:43:30] INFO gpt_neox_model.py:49: context_window_size not found in config.json. Falling back to max_position_embeddings (2048)
[2024-04-16 09:43:30] INFO gpt_neox_model.py:72: prefill_chunk_size defaults to context_window_size (2048)
Start storing to cache dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1-MLC
[2024-04-16 09:43:33] INFO huggingface_loader.py:184: Loading HF parameters from: dist/models/RedPajama-INCITE-Instruct-3B-v1/pytorch_model.bin
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.0rc1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last):
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/bin/mlc_llm", line 33, in <module>
    sys.exit(load_entry_point('mlc-llm', 'console_scripts', 'mlc_llm')())
  File "/home/wangh/mlc-llm/python/mlc_llm/main.py", line 29, in main
    cli.main(sys.argv[2:])
  File "/home/wangh/mlc-llm/python/mlc_llm/cli/convert_weight.py", line 87, in main
    convert_weight(
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 182, in convert_weight
    _convert_args(args)
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 146, in _convert_args
    tvmjs.dump_ndarray_cache(
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/tvm/contrib/tvmjs.py", line 210, in dump_ndarray_cache
    for k, origin_v in param_generator:
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 123, in _param_generator
    loader = LOADER[args.source_format](
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/huggingface_loader.py", line 88, in __init__
    self._load_file(path)
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/huggingface_loader.py", line 188, in _load_file
    for name, param in load_func(path):
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/utils.py", line 40, in load_torch_shard
    import torch  # pylint: disable=import-outside-toplevel
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/__init__.py", line 1477, in <module>
    from .functional import *  # noqa: F403
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/functional.py", line 9, in <module>
    import torch.nn.functional as F
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder, \
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
Traceback (most recent call last):
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/bin/mlc_llm", line 33, in <module>
    sys.exit(load_entry_point('mlc-llm', 'console_scripts', 'mlc_llm')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangh/mlc-llm/python/mlc_llm/main.py", line 29, in main
    cli.main(sys.argv[2:])
  File "/home/wangh/mlc-llm/python/mlc_llm/cli/convert_weight.py", line 87, in main
    convert_weight(
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 182, in convert_weight
    _convert_args(args)
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 146, in _convert_args
    tvmjs.dump_ndarray_cache(
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/tvm/contrib/tvmjs.py", line 210, in dump_ndarray_cache
    for k, origin_v in param_generator:
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 123, in _param_generator
    loader = LOADER[args.source_format](
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/huggingface_loader.py", line 88, in __init__
    self._load_file(path)
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/huggingface_loader.py", line 188, in _load_file
    for name, param in load_func(path):
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/utils.py", line 50, in load_torch_shard
    param = param.numpy()
            ^^^^^^^^^^^^^
RuntimeError: Numpy is not available
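A possible recovery sequence for the failure above, under the assumption (suggested earlier in the thread) that the installed torch wheel was built against the NumPy 1.x ABI: pin NumPy below 2.0, then re-run the conversion with the exact paths and flags from the log.

```shell
# Pin NumPy below 2.0 so torch (compiled against the 1.x ABI) can initialize.
pip install "numpy<2"   # e.g. numpy==1.26.4

# Re-run the weight conversion from the log above.
mlc_llm convert_weight ./dist/models/RedPajama-INCITE-Instruct-3B-v1/ \
    --quantization q4f16_1 \
    -o dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1-MLC
```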
@0xDEADFED5 thank you!!! I tried it and it works well.
@MasterJH5574
I encountered the same issue when running llava in the chat module. After I successfully converted the weights of llava and generated the chat config, I got the result below. I suspect it may be related to the version of TVM: I installed the latest version of TVM Unity using pip, and I will try to build an earlier version from source to see if that works.
Thank you so much @XJY990705, we are still looking into this to understand what is happening.
Hi @XJY990705 @0xDEADFED5 , we have fixed this. Please update the pip package and try again :-) and also please let me know how things go.
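For reference, updating typically means reinstalling the nightly wheels from the mlc.ai index; the package names below are assumptions modeled on that index's naming scheme, so substitute the suffix that matches your CUDA (or CPU/Metal) setup.

```shell
# Hypothetical update commands -- adjust the platform suffix (e.g. cu122)
# to your environment before running.
pip install --pre -U -f https://mlc.ai/wheels \
    mlc-ai-nightly-cu122 mlc-llm-nightly-cu122
```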
@MasterJH5574 I reinstalled TVM Unity using pip and it works well now, thank you very much!!