Giter Club home page Giter Club logo

360zhinao's People

Contributors

garycaokai avatar haoshengzou avatar yusifu avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

360zhinao's Issues

{ "msg": "name 'message' is not defined", "status_code": 500 }

4090服务器上,搭建完环境后,运行python openai_api.py ,服务启动正常,但是发送请求时,结果都是显示: { "msg": "name 'message' is not defined", "status_code": 500 }
这个是内部代码哪里报错了吗?

postpostman请求示例
image

Failed building wheel for flash-attn

7fa95e6af504d3c20085eebc7bff7bf

按照readme操作,遇到如下问题,也尝试自己解决了一下,还是无法编译 flash-attn

系统:win11
python: 3.10.0

cmake ,ninja, nvcc 如上图。

多模态大模型访问方式

hello,看介绍360智脑具备图文理解的能力,但是好像没找到可测试的体验入口?请问多模态的模型有开源或者技术blog之类的吗,谢谢。

运行微调时报错“torch.cuda.OutOfMemoryError: CUDA out of memory. ”

运行微调时出现如下报错信息:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 14.48 GiB. GPU has a total capacity of 23.68 GiB of which 1.51 GiB is free. Including non-PyTorch memory, this process has 22.16 GiB memory in use. Of the allocated memory 21.79 GiB is allocated by PyTorch, and 18.10 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

我的是双GPU GTX3090

项目运行360Zhinao-7B-Chat-32K 'NoneType' object is not callable

运行环境:
python = 3.11.7
pytorch = 2.2.2
transformers = 4.38.2
CUDA = 12.1

参考github官网:https://github.com/Qihoo360/360zhinao

  1. 按顺序安装
    pip install -r requirements.txt
    pip install flash_attn-2.5.6+cu118torch2.2cxx11abiTRUE-cp311-cp311-linux_x86_64.whl

  2. 从ModelScope社区下载360Zhinao-7B-Chat-32K
    from modelscope import snapshot_download
    model_dir_360Zhinao_7B_Chat_32K = snapshot_download("qihoo360/360Zhinao-7B-Chat-32K", revision = "master")

  3. 替换模型地址为
    MODEL_NAME_OR_PATH = "/home/zhifeng.zhao/.cache/modelscope/hub/qihoo360/360Zhinao-7B-Chat-32K"

  4. 运行streamlit run web_demo.py,报错'NoneType' object is not callable,详细报错信息见下方。

(360zhinao) xxx@xxx:~/360zhinao$ streamlit run web_demo.py

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

2024-04-17 17:58:25.122 Did not auto detect external IP.
Please go to https://docs.streamlit.io/ for debugging hints.

You can now view your Streamlit app in your browser.

Network URL: http://192.168.50.126:8501

Please install FlashAttention first, e.g., with pip install flash-attn
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:02<00:00, 2.71it/s]
ic| self.eos_token_id: 158326
self.pad_token_id: 158323
self.im_start_id: 158332
self.im_end_id: 158333
generation_config: GenerationConfig {
"do_sample": true,
"eos_token_id": [
158326,
158332,
158333
],
"max_new_tokens": 512,
"pad_token_id": 158326,
"top_p": 0.8
}

Exception in thread Thread-7 (generate):
Traceback (most recent call last):
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
self.run()
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/threading.py", line 982, in run
self._target(*self._args, **self._kwargs)
File "/home/zhifeng.zhao/.cache/huggingface/modules/transformers_modules/360Zhinao-7B-Chat-32K/modeling_zhinao.py", line 918, in generate
response = super().generate(
^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/transformers/generation/utils.py", line 1592, in generate
return self.sample(
^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/transformers/generation/utils.py", line 2696, in sample
outputs = self(
^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/.cache/huggingface/modules/transformers_modules/360Zhinao-7B-Chat-32K/modeling_zhinao.py", line 816, in forward
outputs = self.model(
^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/.cache/huggingface/modules/transformers_modules/360Zhinao-7B-Chat-32K/modeling_zhinao.py", line 711, in forward
layer_outputs = decoder_layer(
^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/.cache/huggingface/modules/transformers_modules/360Zhinao-7B-Chat-32K/modeling_zhinao.py", line 513, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/anaconda3/envs/360zhinao/lib/python3.11/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/.cache/huggingface/modules/transformers_modules/360Zhinao-7B-Chat-32K/modeling_zhinao.py", line 416, in forward
attn_output = self.flash_attention(query_states, key_states, value_states, attention_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/.cache/huggingface/modules/transformers_modules/360Zhinao-7B-Chat-32K/modeling_zhinao.py", line 345, in flash_attention
query_states, key_states, value_states, indices_q, cu_seq_lens, max_seq_lens = self._upad_input(
^^^^^^^^^^^^^^^^^
File "/home/zhifeng.zhao/.cache/huggingface/modules/transformers_modules/360Zhinao-7B-Chat-32K/modeling_zhinao.py", line 442, in _upad_input
key_layer = index_first_axis(key_layer.reshape(batch_size * kv_seq_len, num_heads, head_dim), indices_k)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not callable

关于模型大小及需要gpu内存大小的问题

以下模型大小是多少,各需要多大内存的gpu才能跑起来,最好能在环境要求或模型下载里面说明一下:
360Zhinao-7B-Base
360Zhinao-7B-Chat-4K
360Zhinao-7B-Chat-32K
360Zhinao-7B-Chat-360K

问问题的时候出现错误,报错

模型:360Zhinao-7B-Chat-4K
环境:windows wsl2
显卡:nvidia 4090 ,24g
环境变量安装:已全部安装

报错如下:
Traceback (most recent call last):
File "/usr/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
self.run()
File "/usr/lib/python3.11/threading.py", line 975, in run
self._target(*self._args, **self._kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/qihoo360/360Zhinao-7B-Chat-4K/7ac2410120e0bd9a91baa92c0f3f973590dac490/modeling_zhinao.py", line 918, in generate
response = super().generate(
^^^^^^^^^^^^^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/transformers/generation/utils.py", line 1592, in generate
return self.sample(
^^^^^^^^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/transformers/generation/utils.py", line 2696, in sample
outputs = self(
^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.cache/huggingface/modules/transformers_modules/qihoo360/360Zhinao-7B-Chat-4K/7ac2410120e0bd9a91baa92c0f3f973590dac490/modeling_zhinao.py", line 816, in forward
outputs = self.model(
^^^^^^^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.cache/huggingface/modules/transformers_modules/qihoo360/360Zhinao-7B-Chat-4K/7ac2410120e0bd9a91baa92c0f3f973590dac490/modeling_zhinao.py", line 711, in forward
layer_outputs = decoder_layer(
^^^^^^^^^^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.cache/huggingface/modules/transformers_modules/qihoo360/360Zhinao-7B-Chat-4K/7ac2410120e0bd9a91baa92c0f3f973590dac490/modeling_zhinao.py", line 513, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
^^^^^^^^^^^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/c/Users/flyfo/Desktop/360zhinao/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.cache/huggingface/modules/transformers_modules/qihoo360/360Zhinao-7B-Chat-4K/7ac2410120e0bd9a91baa92c0f3f973590dac490/modeling_zhinao.py", line 416, in forward
attn_output = self.flash_attention(query_states, key_states, value_states, attention_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.cache/huggingface/modules/transformers_modules/qihoo360/360Zhinao-7B-Chat-4K/7ac2410120e0bd9a91baa92c0f3f973590dac490/modeling_zhinao.py", line 345, in flash_attention
query_states, key_states, value_states, indices_q, cu_seq_lens, max_seq_lens = self._upad_input(
^^^^^^^^^^^^^^^^^
File "/root/.cache/huggingface/modules/transformers_modules/qihoo360/360Zhinao-7B-Chat-4K/7ac2410120e0bd9a91baa92c0f3f973590dac490/modeling_zhinao.py", line 442, in _upad_input
key_layer = index_first_axis(key_layer.reshape(batch_size * kv_seq_len, num_heads, head_dim), indices_k)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not callable

Not compatible with vllm 4.0

File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/zhinao.py", line 31, in
from vllm.model_executor.input_metadata import InputMetadata
ModuleNotFoundError: No module named 'vllm.model_executor.input_metadata'

maybe should change from vllm.model_executor.input_metadata import InputMetadata to

from vllm.attention import Attention, AttentionMetadata

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.