Comments (11)
I had similar issues before. I think I fixed it with pip install numpy==1.26.4
from mlc-llm.
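The ABI mismatch behind that fix can be checked up front. A minimal sketch (the helper name is ours, not part of mlc-llm or NumPy) that tests whether a NumPy version string still provides the 1.x ABI that prebuilt wheels expect:

```python
# Sketch: decide whether a NumPy version predates the 2.0 ABI break that
# crashes extension modules compiled against NumPy 1.x.
# (Hypothetical helper, not part of mlc-llm or NumPy.)

def numpy_1x_abi(version: str) -> bool:
    """Return True when `version` is a NumPy 1.x release."""
    return int(version.split(".")[0]) < 2

print(numpy_1x_abi("1.26.4"))    # True: safe with wheels built against 1.x
print(numpy_1x_abi("2.0.0rc1"))  # False: the version named in the traceback below
```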
And I got another error using the command line:
And today I got another problem; upgrading numpy made no difference.
Thank you @XJY990705 for reporting.
The first error is because the prebuilt library at dist/prebuilt_libs/Llama-2-7b-chat-hf/Llama-2-7b-chat-hf-q4f16_1-cuda.so is outdated and does not contain the latest changes. We will update our documentation to remove the incorrect examples. In the meantime, could you try the following for constructing the ChatModule?
cm = ChatModule(model="HF://mlc-ai/Llama-2-7b-chat-hf-q4f16_1-MLC")
I suspect that after constructing the ChatModule as above, you might again encounter the issue you just reported (Unsupported layout: 0). I don't have an idea off the top of my head; I will discuss with others to see if we can address this, and report back.
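As a sketch of the suggested construction (assuming the mlc_llm Python package and a CUDA-capable setup are installed; the HF:// scheme tells mlc-llm to fetch the precompiled weights and library from the mlc-ai Hugging Face organization, and the import path follows the mlc-llm docs of this period):

```python
# Suggested ChatModule construction; the HF:// URL resolves to the
# mlc-ai/Llama-2-7b-chat-hf-q4f16_1-MLC repo on Hugging Face.
MODEL = "HF://mlc-ai/Llama-2-7b-chat-hf-q4f16_1-MLC"

try:
    from mlc_llm import ChatModule  # requires the mlc-llm pip package

    cm = ChatModule(model=MODEL)
    print(cm.generate("What is the capital of France?"))
except ImportError:
    # Environments without mlc_llm installed land here.
    print("mlc_llm is not installed; see https://llm.mlc.ai/docs/")
```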
@MasterJH5574 I got this problem, though:
And when I try to convert the weights according to the document https://llm.mlc.ai/docs/compilation/convert_weights.html, I get the problem below; upgrading numpy did not work:
(mlc-chat-venv) wangh@zj-MS-7B17:~/mlc-llm$ mlc_llm convert_weight ./dist/models/RedPajama-INCITE-Instruct-3B-v1/ \
    --quantization q4f16_1 \
    -o dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1-MLC
[2024-04-16 09:43:25] INFO auto_config.py:115: Found model configuration: dist/models/RedPajama-INCITE-Instruct-3B-v1/config.json
[2024-04-16 09:43:26] INFO auto_device.py:76: Found device: cuda:0
[2024-04-16 09:43:27] INFO auto_device.py:85: Not found device: rocm:0
[2024-04-16 09:43:28] INFO auto_device.py:85: Not found device: metal:0
[2024-04-16 09:43:29] INFO auto_device.py:76: Found device: vulkan:0
[2024-04-16 09:43:29] INFO auto_device.py:76: Found device: vulkan:1
[2024-04-16 09:43:30] INFO auto_device.py:85: Not found device: opencl:0
[2024-04-16 09:43:30] INFO auto_device.py:33: Using device: cuda:0
[2024-04-16 09:43:30] INFO auto_weight.py:70: Finding weights in: dist/models/RedPajama-INCITE-Instruct-3B-v1
[2024-04-16 09:43:30] INFO auto_weight.py:129: Found source weight format: huggingface-torch. Source configuration: dist/models/RedPajama-INCITE-Instruct-3B-v1/pytorch_model.bin
[2024-04-16 09:43:30] INFO auto_weight.py:167: Not found Huggingface Safetensor
[2024-04-16 09:43:30] INFO auto_weight.py:106: Using source weight configuration: dist/models/RedPajama-INCITE-Instruct-3B-v1/pytorch_model.bin. Use --source to override.
[2024-04-16 09:43:30] INFO auto_weight.py:110: Using source weight format: huggingface-torch. Use --source-format to override.
[2024-04-16 09:43:30] INFO auto_config.py:153: Found model type: gpt_neox. Use --model-type to override.
Weight conversion with arguments:
--config dist/models/RedPajama-INCITE-Instruct-3B-v1/config.json
--quantization GroupQuantize(name='q4f16_1', kind='group-quant', group_size=32, quantize_dtype='int4', storage_dtype='uint32', model_dtype='float16', linear_weight_layout='NK', quantize_embedding=True, quantize_final_fc=True, num_elem_per_storage=8, num_storage_per_group=4, max_int_value=7)
--model-type gpt_neox
--device cuda:0
--source dist/models/RedPajama-INCITE-Instruct-3B-v1/pytorch_model.bin
--source-format huggingface-torch
--output dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1-MLC
[2024-04-16 09:43:30] INFO gpt_neox_model.py:49: context_window_size not found in config.json. Falling back to max_position_embeddings (2048)
[2024-04-16 09:43:30] INFO gpt_neox_model.py:72: prefill_chunk_size defaults to context_window_size (2048)
Start storing to cache dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1-MLC
[2024-04-16 09:43:33] INFO huggingface_loader.py:184: Loading HF parameters from: dist/models/RedPajama-INCITE-Instruct-3B-v1/pytorch_model.bin
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.0rc1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last):
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/bin/mlc_llm", line 33, in <module>
    sys.exit(load_entry_point('mlc-llm', 'console_scripts', 'mlc_llm')())
  File "/home/wangh/mlc-llm/python/mlc_llm/main.py", line 29, in main
    cli.main(sys.argv[2:])
  File "/home/wangh/mlc-llm/python/mlc_llm/cli/convert_weight.py", line 87, in main
    convert_weight(
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 182, in convert_weight
    _convert_args(args)
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 146, in _convert_args
    tvmjs.dump_ndarray_cache(
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/tvm/contrib/tvmjs.py", line 210, in dump_ndarray_cache
    for k, origin_v in param_generator:
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 123, in _param_generator
    loader = LOADER[args.source_format](
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/huggingface_loader.py", line 88, in __init__
    self._load_file(path)
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/huggingface_loader.py", line 188, in _load_file
    for name, param in load_func(path):
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/utils.py", line 40, in load_torch_shard
    import torch  # pylint: disable=import-outside-toplevel
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/__init__.py", line 1477, in <module>
    from .functional import *  # noqa: F403
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/functional.py", line 9, in <module>
    import torch.nn.functional as F
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder, \
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
Traceback (most recent call last):
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/bin/mlc_llm", line 33, in <module>
    sys.exit(load_entry_point('mlc-llm', 'console_scripts', 'mlc_llm')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangh/mlc-llm/python/mlc_llm/main.py", line 29, in main
    cli.main(sys.argv[2:])
  File "/home/wangh/mlc-llm/python/mlc_llm/cli/convert_weight.py", line 87, in main
    convert_weight(
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 182, in convert_weight
    _convert_args(args)
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 146, in _convert_args
    tvmjs.dump_ndarray_cache(
  File "/home/wangh/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/tvm/contrib/tvmjs.py", line 210, in dump_ndarray_cache
    for k, origin_v in param_generator:
  File "/home/wangh/mlc-llm/python/mlc_llm/interface/convert_weight.py", line 123, in _param_generator
    loader = LOADER[args.source_format](
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/huggingface_loader.py", line 88, in __init__
    self._load_file(path)
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/huggingface_loader.py", line 188, in _load_file
    for name, param in load_func(path):
  File "/home/wangh/mlc-llm/python/mlc_llm/loader/utils.py", line 50, in load_torch_shard
    param = param.numpy()
            ^^^^^^^^^^^^^
RuntimeError: Numpy is not available
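A possible recovery sequence for the failure above, under the assumption (suggested earlier in the thread) that the installed torch wheel was built against the NumPy 1.x ABI: pin NumPy below 2.0, then re-run the conversion with the exact paths and flags from the log.

```shell
# Pin NumPy below 2.0 so torch (compiled against the 1.x ABI) can initialize.
pip install "numpy<2"   # e.g. numpy==1.26.4

# Re-run the weight conversion from the log above.
mlc_llm convert_weight ./dist/models/RedPajama-INCITE-Instruct-3B-v1/ \
    --quantization q4f16_1 \
    -o dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1-MLC
```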
@0xDEADFED5 thank you!!! I tried it and it works well.
@MasterJH5574
I encountered the same issue when running llava in the chat module. After I successfully converted the weights of llava and generated the chat config, I got the result below. I suspect it may be related to the version of TVM: I installed the latest version of TVM Unity using pip, and I will try to build an earlier version from source to see if that works.
Thank you so much @XJY990705, we are still looking into this to understand what is happening.
Hi @XJY990705 @0xDEADFED5 , we have fixed this. Please update the pip package and try again :-) and also please let me know how things go.
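For reference, updating typically means reinstalling the nightly wheels from the mlc.ai index; the package names below are assumptions modeled on that index's naming scheme, so substitute the suffix that matches your CUDA (or CPU/Metal) setup.

```shell
# Hypothetical update commands -- adjust the platform suffix (e.g. cu122)
# to your environment before running.
pip install --pre -U -f https://mlc.ai/wheels \
    mlc-ai-nightly-cu122 mlc-llm-nightly-cu122
```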
@MasterJH5574 I reinstalled TVM Unity using pip and it works well now, thank you very much!!