Comments (7)

han-sogawa commented on June 18, 2024

https://huggingface.co/api/models/hkunlp/instructor-large is the file it cannot download, although I can access it in the browser.
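
One way to confirm the machine itself can reach that URL is a plain HTTP request from the same shell; a quick sketch (a 200 status means the endpoint is reachable, so the failure would be inside the download stack rather than the network):

curl -s -o /dev/null -w '%{http_code}\n' https://huggingface.co/api/models/hkunlp/instructor-large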

pseudotensor commented on June 18, 2024

Are you using that as the base model? What is your actual generate.py line?

han-sogawa commented on June 18, 2024

No, it looks like it is another dependency that attempts to download regardless of which base model I am using.

One example of a generate.py line that I have tried:
python generate.py --base_model=meta-llama/llama-2-7b-chat-hf --score_model=None --langchain_mode='UserData' --user_path=user_path --use_auth_token=True --max_seq_len=4096 --max_max_new_tokens=2048

han-sogawa commented on June 18, 2024

I think this may be where it tries to download the file:

embedding = HuggingFaceInstructEmbeddings(model_name=hf_embedding_model, ...)
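
For reference, the embedding load can be reproduced outside generate.py in a few lines. This is a minimal sketch, assuming LangChain's community package layout (older versions import from langchain.embeddings instead) and that the InstructorEmbedding and sentence-transformers packages it wraps are installed:

from langchain_community.embeddings import HuggingFaceInstructEmbeddings

# Constructing the embeddings object is what downloads/loads the model,
# which is the step that fails here; embed_query just exercises it.
embedding = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large")
print(embedding.embed_query("hello world")[:5])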

pseudotensor commented on June 18, 2024

What if you try a different embedding model? E.g., add this to your generate.py line:

--hf_embedding_model=sentence-transformers/all-MiniLM-L12-v2

Also, you can try disabling hf_transfer by setting this environment variable:

export HF_HUB_ENABLE_HF_TRANSFER=0
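
Combining both suggestions with the command from earlier in the thread, the full invocation would look like this:

export HF_HUB_ENABLE_HF_TRANSFER=0
python generate.py --base_model=meta-llama/llama-2-7b-chat-hf --score_model=None --langchain_mode='UserData' --user_path=user_path --use_auth_token=True --max_seq_len=4096 --max_max_new_tokens=2048 --hf_embedding_model=sentence-transformers/all-MiniLM-L12-v2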

pseudotensor commented on June 18, 2024

FYI, this is what it looks like when running the command you gave:

(h2ogpt) jon@pseudotensor:~/h2ogpt$ python generate.py --base_model=meta-llama/llama-2-7b-chat-hf --score_model=None --langchain_mode='UserData' --user_path=user_path --use_auth_token=True --max_seq_len=4096 --max_max_new_tokens=2048
Using Model meta-llama/llama-2-7b-chat-hf
load INSTRUCTOR_Transformer
max_seq_length  512
Starting get_model: meta-llama/llama-2-7b-chat-hf 
/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 614/614 [00:00<00:00, 1.45MB/s]
Overriding max_seq_len -> 4096
tokenizer_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.62k/1.62k [00:00<00:00, 3.93MB/s]
tokenizer.model: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 500k/500k [00:00<00:00, 8.89MB/s]
tokenizer.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.84M/1.84M [00:00<00:00, 5.95MB/s]
special_tokens_map.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 414/414 [00:00<00:00, 876kB/s]
Overriding max_seq_len -> 4096
Overriding max_seq_len -> 4096
device_map: {'': 0}
pytorch_model.bin.index.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 26.8k/26.8k [00:00<00:00, 85.6MB/s]
pytorch_model-00001-of-00002.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 9.98G/9.98G [01:29<00:00, 112MB/s]
pytorch_model-00002-of-00002.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 3.50G/3.50G [00:31<00:00, 110MB/s]
Downloading shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [02:01<00:00, 60.77s/it]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:04<00:00,  2.30s/it]
generation_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 188/188 [00:00<00:00, 530kB/s]
Model {'base_model': 'meta-llama/llama-2-7b-chat-hf', 'base_model0': 'meta-llama/llama-2-7b-chat-hf', 'tokenizer_base_model': '', 'lora_weights': '', 'inference_server': '', 'prompt_type': 'llama2', 'prompt_dict': {'promptA': '', 'promptB': '', 'PreInstruct': "<s>[INST] <<SYS>>\nYou are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n<</SYS>>\n\n", 'PreInput': None, 'PreResponse': '[/INST]', 'terminate_response': ['[INST]', '</s>'], 'chat_sep': ' ', 'chat_turn_sep': ' </s>', 'humanstr': '[INST]', 'botstr': '[/INST]', 'generates_leading_space': False, 'system_prompt': "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.", 'can_handle_system_prompt': True}, 'display_name': 'meta-llama/llama-2-7b-chat-hf', 'visible_models': None, 'h2ogpt_key': None, 'load_8bit': False, 'load_4bit': False, 'low_bit_mode': 1, 'load_half': True, 'use_flash_attention_2': False, 'load_gptq': '', 'load_awq': '', 'load_exllama': False, 'use_safetensors': False, 'revision': None, 'use_gpu_id': True, 'gpu_id': 0, 'compile_model': None, 'use_cache': None, 'llamacpp_dict': {'n_gpu_layers': 100, 'use_mlock': True, 'n_batch': 1024, 'n_gqa': 0, 'model_path_llama': '', 'model_name_gptj': '', 'model_name_gpt4all_llama': '', 'model_name_exllama_if_no_config': ''}, 'rope_scaling': {}, 'max_seq_len': 4096, 'max_output_seq_len': None, 'exllama_dict': {}, 'gptq_dict': {}, 'attention_sinks': False, 'sink_dict': {}, 'truncation_generation': False, 'hf_model_dict': {}, 'force_seq2seq_type': False, 'force_t5_type': False, 'trust_remote_code': True}
Begin auto-detect HF cache text generation models
/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
No loading model philschmid/bart-large-cnn-samsum because is_encoder_decoder=True
/home/jon/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-30b-instruct/68deee8b69383b30826ea2fc642ba170b89e4edd/configuration_mpt.py:114: UserWarning: alibi or rope is turned on, setting `learned_pos_emb` to `False.`
  warnings.warn(f'alibi or rope is turned on, setting `learned_pos_emb` to `False.`')
/home/jon/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-30b-instruct/68deee8b69383b30826ea2fc642ba170b89e4edd/configuration_mpt.py:141: UserWarning: If not using a Prefix Language Model, we recommend setting "attn_impl" to "flash" instead of "triton".
  warnings.warn(UserWarning('If not using a Prefix Language Model, we recommend setting "attn_impl" to "flash" instead of "triton".'))
WARNING:transformers_modules.tiiuae.falcon-40b-instruct.ecb78d97ac356d098e79f0db222c9ce7c5d9ee5f.configuration_falcon:
WARNING: You are currently loading Falcon using legacy code contained in the model repository. Falcon has now been fully ported into the Hugging Face transformers library. For the most up-to-date and high-performance version of the Falcon model code, please update to the latest version of transformers and then load the model without the trust_remote_code=True argument.

No loading model openai/whisper-large-v3 because is_encoder_decoder=True
No loading model openai/whisper-base.en because is_encoder_decoder=True
No loading model h2oai/ggml because h2oai/ggml does not appear to have a file named config.json. Checkout 'https://huggingface.co/h2oai/ggml/main' for available files.
No loading model Systran/faster-whisper-large-v3 because is_encoder_decoder=True
No loading model openai/whisper-medium because is_encoder_decoder=True
No loading model philschmid/flan-t5-base-samsum because is_encoder_decoder=True
No loading model stabilityai/stable-diffusion-xl-refiner-1.0 because stabilityai/stable-diffusion-xl-refiner-1.0 does not appear to have a file named config.json. Checkout 'https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/main' for available files.
No loading model distil-whisper/distil-large-v2 because is_encoder_decoder=True
No loading model tloen/alpaca-lora-7b because tloen/alpaca-lora-7b does not appear to have a file named config.json. Checkout 'https://huggingface.co/tloen/alpaca-lora-7b/main' for available files.
/home/jon/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b/039e37745f00858f0e01e988383a8c4393b1a4f5/configuration_mpt.py:114: UserWarning: alibi or rope is turned on, setting `learned_pos_emb` to `False.`
  warnings.warn(f'alibi or rope is turned on, setting `learned_pos_emb` to `False.`')
/home/jon/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b/039e37745f00858f0e01e988383a8c4393b1a4f5/configuration_mpt.py:141: UserWarning: If not using a Prefix Language Model, we recommend setting "attn_impl" to "flash" instead of "triton".
  warnings.warn(UserWarning('If not using a Prefix Language Model, we recommend setting "attn_impl" to "flash" instead of "triton".'))
No loading model distil-whisper/distil-large-v3 because is_encoder_decoder=True
No loading model microsoft/speecht5_hifigan because The checkpoint you are trying to load has model type `hifigan` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
No loading model unstructuredio/detectron2_faster_rcnn_R_50_FPN_3x because unstructuredio/detectron2_faster_rcnn_R_50_FPN_3x does not appear to have a file named config.json. Checkout 'https://huggingface.co/unstructuredio/detectron2_faster_rcnn_R_50_FPN_3x/main' for available files.
No loading model stabilityai/stable-diffusion-xl-base-1.0 because stabilityai/stable-diffusion-xl-base-1.0 does not appear to have a file named config.json. Checkout 'https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/main' for available files.
No loading model Salesforce/blip2-flan-t5-xl because is_encoder_decoder=True
/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/models/llava/configuration_llava.py:103: FutureWarning: The `vocab_size` argument is deprecated and will be removed in v4.42, since it can be inferred from the `text_config`. Passing this argument has no effect
  warnings.warn(
/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/models/llava/configuration_llava.py:143: FutureWarning: The `vocab_size` attribute is deprecated and will be removed in v4.42, Please use `text_config.vocab_size` instead.
  warnings.warn(
No loading model google/pix2struct-textcaps-base because is_encoder_decoder=True
No loading model Salesforce/blip2-flan-t5-xxl because is_encoder_decoder=True
/home/jon/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-30b-chat/28fc475f7b73a5631fbbc6419645c27177f275d4/configuration_mpt.py:114: UserWarning: alibi or rope is turned on, setting `learned_pos_emb` to `False.`
  warnings.warn(f'alibi or rope is turned on, setting `learned_pos_emb` to `False.`')
/home/jon/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-30b-chat/28fc475f7b73a5631fbbc6419645c27177f275d4/configuration_mpt.py:141: UserWarning: If not using a Prefix Language Model, we recommend setting "attn_impl" to "flash" instead of "triton".
  warnings.warn(UserWarning('If not using a Prefix Language Model, we recommend setting "attn_impl" to "flash" instead of "triton".'))
No loading model microsoft/speecht5_vc because is_encoder_decoder=True
No loading model microsoft/speecht5_tts because is_encoder_decoder=True
End auto-detect HF cache text generation models
Begin auto-detect llama.cpp models
End auto-detect llama.cpp models
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
Started Gradio Server and/or GUI: server_name: localhost port: 7860
Use local URL: http://localhost:7860/
/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pydantic/_internal/_fields.py:160: UserWarning: Field "model_info" has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pydantic/_internal/_fields.py:160: UserWarning: Field "model_names" has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
OpenAI API URL: http://0.0.0.0:5000
INFO:__name__:OpenAI API URL: http://0.0.0.0:5000
OpenAI API key: EMPTY
INFO:__name__:OpenAI API key: EMPTY

All fine here.

If I remove the cached instructor-large model and try again:

(h2ogpt) jon@pseudotensor:~/h2ogpt$ rm -rf ~/.cache/torch/sentence_transformers/hkunlp_instructor-large/
(h2ogpt) jon@pseudotensor:~/h2ogpt$ python generate.py --base_model=meta-llama/llama-2-7b-chat-hf --score_model=None --langchain_mode='UserData' --user_path=user_path --use_auth_token=True --max_seq_len=4096 --max_max_new_tokens=2048
Using Model meta-llama/llama-2-7b-chat-hf
.gitattributes: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.48k/1.48k [00:00<00:00, 3.83MB/s]
1_Pooling/config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [00:00<00:00, 792kB/s]
2_Dense/config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 116/116 [00:00<00:00, 1.52MB/s]
pytorch_model.bin: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.15M/3.15M [00:00<00:00, 31.9MB/s]
README.md: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 66.3k/66.3k [00:00<00:00, 1.16MB/s]
config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.53k/1.53k [00:00<00:00, 3.43MB/s]
config_sentence_transformers.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 122/122 [00:00<00:00, 1.59MB/s]
pytorch_model.bin: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 1.34G/1.34G [00:13<00:00, 100MB/s]
sentence_bert_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 53.0/53.0 [00:00<00:00, 116kB/s]
special_tokens_map.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2.20k/2.20k [00:00<00:00, 6.28MB/s]
spiece.model: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 792k/792k [00:00<00:00, 13.6MB/s]
tokenizer.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2.42M/2.42M [00:00<00:00, 12.8MB/s]
tokenizer_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2.41k/2.41k [00:00<00:00, 7.18MB/s]
modules.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 461/461 [00:00<00:00, 5.86MB/s]
load INSTRUCTOR_Transformer
... same as before

It downloads fine, so I guess you have some network complication on your end.
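
If a proxy or restricted egress is involved, note that the Hub client downloads via requests and so honors the standard proxy environment variables; a sketch with a hypothetical proxy address:

export HTTPS_PROXY=http://proxy.example.com:8080  # hypothetical address; substitute your proxy
python generate.py ...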

han-sogawa commented on June 18, 2024

The workaround of adding --hf_embedding_model=sentence-transformers/all-MiniLM-L12-v2 worked for me, thank you! I still don't know why the instructor-large embedding file wouldn't download; I'll update if I find out more, but for now my issue is resolved. Thank you very much!
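
For anyone who still needs instructor-large behind a restrictive network, another workaround is to copy the already-downloaded model from a machine that can reach the Hub; a sketch assuming the cache path shown in the log above ('otherhost' is a placeholder for the machine with working access):

scp -r otherhost:~/.cache/torch/sentence_transformers/hkunlp_instructor-large ~/.cache/torch/sentence_transformers/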
