Comments (19)
Hi, can you review this first? Then see what else you need. Thanks.
https://github.com/h2oai/h2ogpt/blob/main/docs/README_offline.md
from h2ogpt.
I went through the link you mentioned. It covers all the necessary offline steps, but without Docker. I'm asking how to build a single image that contains one base model, config, parser, etc., and can run h2oGPT offline successfully.
from h2ogpt.
The command I ran after creating the docker image:
sudo docker run --gpus '"device=0"' --runtime=nvidia --shm-size=2g -p $GRADIO_SERVER_PORT:$GRADIO_SERVER_PORT -p $OPENAI_SERVER_PORT:$OPENAI_SERVER_PORT --rm --init --network host -v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro -u
id -u:
id -g -v "${HOME}"/.cache:/workspace/.cache -v "${HOME}"/save:/workspace/save -v "${HOME}"/user_path:/workspace/user_path -v "${HOME}"/db_dir_UserData:/workspace/db_dir_UserData -v "${HOME}"/users:/workspace/users -v "${HOME}"/db_nonusers:/workspace/db_nonusers -v "${HOME}"/llamacpp_path:/workspace/llamacpp_path -e GRADIO_SERVER_PORT=$GRADIO_SERVER_PORT h2ogpt_3 /workspace/generate.py --base_model=mistralai/Mistral-7B-Instruct-v0.2 --use_safetensors=True --prompt_type=mistral --save_dir='/workspace/save/' --use_gpu_id=False --user_path=/workspace/user_path --langchain_mode="LLM" --langchain_modes="['UserData', 'LLM', 'MyData']" --score_model=None --max_max_new_tokens=4096 --max_new_tokens=2048 --openai_port=$OPENAI_SERVER_PORT \
After running it, h2oGPT tried to connect to the internet to download the embedding model and threw the following error:
WARNING: Published ports are discarded when using host network mode
Using Model mistralai/mistral-7b-instruct-v0.2
fatal: not a git repository (or any of the parent directories): .git
git_hash.txt failed to be found: [Errno 2] No such file or directory: 'git_hash.txt'
Traceback (most recent call last):
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/urllib3/connection.py", line 198, in _new_conn
sock = connection.create_connection(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/urllib3/util/connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/socket.py", line 955, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/urllib3/connectionpool.py", line 793, in urlopen
response = self._make_request(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/urllib3/connectionpool.py", line 491, in _make_request
raise new_e
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/urllib3/connectionpool.py", line 467, in _make_request
self._validate_conn(conn)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1099, in _validate_conn
conn.connect()
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/urllib3/connection.py", line 616, in connect
self.sock = sock = self._new_conn()
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/urllib3/connection.py", line 205, in _new_conn
raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPSConnection object at 0x71188cb4d780>: Failed to resolve 'huggingface.co' ([Errno -3] Temporary failure in name resolution)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/urllib3/connectionpool.py", line 847, in urlopen
retries = retries.increment(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/urllib3/util/retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/hkunlp/instructor-large (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x71188cb4d780>: Failed to resolve 'huggingface.co' ([Errno -3] Temporary failure in name resolution)"))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/workspace/generate.py", line 20, in <module>
entrypoint_main()
File "/workspace/generate.py", line 16, in entrypoint_main
H2O_Fire(main)
File "/workspace/src/utils.py", line 72, in H2O_Fire
fire.Fire(component=component, command=args)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/workspace/src/gen.py", line 1882, in main
model=get_embedding(use_openai_embedding, hf_embedding_model=hf_embedding_model,
File "/workspace/src/gpt_langchain.py", line 528, in get_embedding
embedding = HuggingFaceInstructEmbeddings(model_name=hf_embedding_model,
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/langchain_community/embeddings/huggingface.py", line 158, in __init__
self.client = INSTRUCTOR(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 87, in __init__
snapshot_download(model_name_or_path,
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/sentence_transformers/util.py", line 442, in snapshot_download
model_info = _api.model_info(repo_id=repo_id, revision=revision, token=token)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2300, in model_info
r = get_session().get(path, headers=headers, timeout=timeout, params=params)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/requests/sessions.py", line 602, in get
return self.request("GET", url, **kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 66, in send
return super().send(request, *args, **kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError('HTTPSConnectionPool(host=\'huggingface.co\', port=443): Max retries exceeded with url: /api/models/hkunlp/instructor-large (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x71188cb4d780>: Failed to resolve \'huggingface.co\' ([Errno -3] Temporary failure in name resolution)"))'), '(Request ID: 395f6b0b-82cc-4691-a4ce-1473ce23ebab)')
I also changed the embedding model with --hf_embedding_model=sentence-transformers/all-MiniLM-L6-v2
That model is already in my .cache folder, both inside the container and on the local system, but it still tried to connect and threw the same error.
from h2ogpt.
Did you follow these kinds of instructions?
https://github.com/h2oai/h2ogpt/blob/93ed01b5d704f668094ede6dffcd6d64f81d6aee/docs/README_offline.md
i.e. this env should be set:
TRANSFORMERS_OFFLINE=1
and useful if pass to h2oGPT: --gradio_offline_level=2 --share=False
from h2ogpt.
I followed all the steps you mentioned, but it didn't work. This is my docker command, including TRANSFORMERS_OFFLINE=1 and --gradio_offline_level=2 --share=False:
export HF_DATASETS_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
export GRADIO_SERVER_PORT=7860
export OPENAI_SERVER_PORT=5000
sudo docker run --gpus all --runtime=nvidia --shm-size=2g -p $GRADIO_SERVER_PORT:$GRADIO_SERVER_PORT -p $OPENAI_SERVER_PORT:$OPENAI_SERVER_PORT --rm --init --network host -v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro -u `id -u`:`id -g` -v "${HOME}"/.cache:/workspace/.cache -v "${HOME}"/save:/workspace/save -v "${HOME}"/user_path:/workspace/user_path -v "${HOME}"/db_dir_UserData:/workspace/db_dir_UserData -v "${HOME}"/users:/workspace/users -v "${HOME}"/db_nonusers:/workspace/db_nonusers -v "${HOME}"/llamacpp_path:/workspace/llamacpp_path -e GRADIO_SERVER_PORT=$GRADIO_SERVER_PORT h2ogpt_1 /workspace/generate.py --base_model=mistralai/Mistral-7B-Instruct-v0.2 --use_safetensors=True --prompt_type=mistral --save_dir='/workspace/save/' --use_gpu_id=False --user_path=/workspace/user_path --langchain_mode="LLM" --langchain_modes="['UserData', 'MyData', 'LLM']" --score_model=None --max_max_new_tokens=2048 --max_new_tokens=1024 --visible_visible_models=False openai_port=$OPENAI_SERVER_PORT
After running it, the same error was thrown. I also changed the embedding model to all-MiniLM-L6-v2, but that didn't work either.
from h2ogpt.
You need to pass the envs as a docker env like you have for the gradio port.
-e GRADIO_SERVER_PORT=$GRADIO_SERVER_PORT \
-e TRANSFORMERS_OFFLINE=$TRANSFORMERS_OFFLINE \
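As a rough illustration of the point above (variables exported on the host are not visible inside the container unless each one is forwarded with `-e`), here is a minimal Python sketch that builds those flags. The helper name `docker_env_flags` and the sample values are made up for this example; they are not part of h2oGPT or Docker.

```python
import os


def docker_env_flags(names, environ=None):
    """Return ['-e', 'NAME=value', ...] for each requested host variable."""
    environ = os.environ if environ is None else environ
    flags = []
    for name in names:
        value = environ.get(name)
        if value is None:
            # With `-e NAME=$NAME` an unset host variable would be passed as empty.
            raise KeyError(f"{name} is not set in the host environment")
        flags += ["-e", f"{name}={value}"]
    return flags


# Hypothetical host environment, mirroring the exports used in this thread.
host_env = {
    "TRANSFORMERS_OFFLINE": "1",
    "HF_DATASETS_OFFLINE": "1",
    "GRADIO_SERVER_PORT": "7860",
}
flags = docker_env_flags(
    ["TRANSFORMERS_OFFLINE", "HF_DATASETS_OFFLINE", "GRADIO_SERVER_PORT"], host_env
)
print(" ".join(flags))
# -e TRANSFORMERS_OFFLINE=1 -e HF_DATASETS_OFFLINE=1 -e GRADIO_SERVER_PORT=7860
```

The printed string can be pasted into the `docker run` line before the image name.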
from h2ogpt.
Modified command:
sudo docker run --gpus all --runtime=nvidia --shm-size=2g -e TRANSFORMERS_OFFLINE=$TRANSFORMERS_OFFLINE -e HF_DATASETS_OFFLINE=$HF_DATASETS_OFFLINE -p $GRADIO_SERVER_PORT:$GRADIO_SERVER_PORT -p $OPENAI_SERVER_PORT:$OPENAI_SERVER_PORT --rm --init --network host -v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro -u
id -u:
id -g -v "${HOME}"/.cache:/workspace/.cache -v "${HOME}"/save:/workspace/save -v "${HOME}"/user_path:/workspace/user_path -v "${HOME}"/db_dir_UserData:/workspace/db_dir_UserData -v "${HOME}"/users:/workspace/users -v "${HOME}"/db_nonusers:/workspace/db_nonusers -v "${HOME}"/llamacpp_path:/workspace/llamacpp_path -e GRADIO_SERVER_PORT=$GRADIO_SERVER_PORT narad_3 /workspace/generate.py --base_model=mistralai/Mistral-7B-Instruct-v0.2 --use_safetensors=True --prompt_type=mistral --save_dir='/workspace/save/' --use_gpu_id=False --user_path=/workspace/user_path --langchain_mode="LLM" --langchain_modes="['UserData', 'MyData', 'LLM']" --score_model=None --max_max_new_tokens=2048 --max_new_tokens=1024 --visible_visible_models=False openai_port=$OPENAI_SERVER_PORT
The error I got this time:
WARNING: Published ports are discarded when using host network mode
Using Model mistralai/mistral-7b-instruct-v0.2
fatal: not a git repository (or any of the parent directories): .git
git_hash.txt failed to be found: [Errno 2] No such file or directory: 'git_hash.txt'
Traceback (most recent call last):
File "/workspace/generate.py", line 20, in <module>
entrypoint_main()
File "/workspace/generate.py", line 16, in entrypoint_main
H2O_Fire(main)
File "/workspace/src/utils.py", line 72, in H2O_Fire
fire.Fire(component=component, command=args)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/workspace/src/gen.py", line 1882, in main
model=get_embedding(use_openai_embedding, hf_embedding_model=hf_embedding_model,
File "/workspace/src/gpt_langchain.py", line 528, in get_embedding
embedding = HuggingFaceInstructEmbeddings(model_name=hf_embedding_model,
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/langchain_community/embeddings/huggingface.py", line 158, in __init__
self.client = INSTRUCTOR(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 87, in __init__
snapshot_download(model_name_or_path,
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/sentence_transformers/util.py", line 442, in snapshot_download
model_info = _api.model_info(repo_id=repo_id, revision=revision, token=token)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2300, in model_info
r = get_session().get(path, headers=headers, timeout=timeout, params=params)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/requests/sessions.py", line 602, in get
return self.request("GET", url, **kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 77, in send
raise OfflineModeIsEnabled(
huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach https://huggingface.co/api/models/hkunlp/instructor-large: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.
from h2ogpt.
Good. Now that things function, the issue is that if you are offline, you need the models in place in the cache. Because you do not, it fails. So you need to follow the offline docs for one of the ways to get those models into the cache locations. This can range from running online first and using the product, to running the smart download way.
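To see whether a model is already in place before starting offline, something like the sketch below may help. It assumes the usual Hugging Face hub cache layout (`models--<org>--<name>/snapshots/<revision>` under `$HF_HOME/hub`); the helper names are made up here, and models loaded through other libraries (such as the embedding model in this thread) may live in a different cache.

```python
import os
from pathlib import Path


def hub_cache_dir(environ=None):
    """Best-effort guess of the hub cache root from HF_HUB_CACHE / HF_HOME."""
    environ = os.environ if environ is None else environ
    hf_home = environ.get("HF_HOME", os.path.join("~", ".cache", "huggingface"))
    return Path(environ.get("HF_HUB_CACHE", os.path.join(hf_home, "hub"))).expanduser()


def is_cached(repo_id, cache_dir):
    """True if at least one snapshot folder exists for repo_id under cache_dir."""
    repo_dir = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())


if __name__ == "__main__":
    repo = "mistralai/Mistral-7B-Instruct-v0.2"
    print(repo, "cached:", is_cached(repo, hub_cache_dir()))
```

Run it with the same `HF_HOME` that is passed to the container so that it looks at the directory the container will actually see.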
from h2ogpt.
- I already have all-MiniLM-L6-v2 in my container's .cache folder. When I pass
--hf_embedding_model=sentence-transformers--all-MiniLM-L6-v2
in the command, it shows the same error as before, except that it now tries to download all-MiniLM-L6-v2 instead of hkunlp/instructor-large.
- My machine is on a captive network, so I can't connect to the internet.
from h2ogpt.
It's not related to h2oGPT at this point. Just do in python:
from InstructorEmbedding import INSTRUCTOR
model = INSTRUCTOR('hkunlp/instructor-large')
That is what the h2oGPT -> langchain code is doing. I presume this fails the same way, and somehow you have to get that model into the right place.
If you place it somewhere manually because you have no internet, that probably won't be right.
It seems you can set the SENTENCE_TRANSFORMERS_HOME env to specify the location. Otherwise it should be in the torch cache etc. See the sentence_transformers package.
e.g. it would be located here by default if nothing is set:
from torch.hub import _get_torch_home
torch_cache_home = _get_torch_home()
print(torch_cache_home)
e.g. /home/jon/.cache/torch
which looks like:
(h2ogpt) jon@pseudotensor:~/h2ogpt$ ls -alrt /home/jon/.cache/torch
total 24
drwx------ 6 jon docker 4096 Feb 24 18:17 pyannote/
drwx------ 3 jon jon 4096 Feb 26 08:07 hub/
drwxrwxr-x 6 jon jon 4096 Feb 26 08:07 ./
drwxrwxr-x 2 jon jon 4096 Mar 21 13:09 kernels/
drwxrwxr-x 6 jon jon 4096 Apr 30 15:35 sentence_transformers/
drwx------ 47 jon jon 4096 May 9 23:40 ../
(h2ogpt) jon@pseudotensor:~/h2ogpt$
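The lookup described above can be approximated without importing torch. This is only a sketch under assumptions: it mirrors the SENTENCE_TRANSFORMERS_HOME, TORCH_HOME and XDG_CACHE_HOME fallbacks and the `org_name` folder naming seen in this thread, which matches the older sentence_transformers releases in the traceback; newer releases may store models elsewhere. The function names are invented for the example.

```python
import os


def torch_home(environ=None):
    """Approximate torch's cache root: TORCH_HOME, else $XDG_CACHE_HOME/torch."""
    environ = os.environ if environ is None else environ
    xdg = environ.get("XDG_CACHE_HOME", os.path.join("~", ".cache"))
    return os.path.expanduser(environ.get("TORCH_HOME", os.path.join(xdg, "torch")))


def sentence_transformers_dir(model_name, environ=None):
    """Folder where the model is expected, e.g. .../hkunlp_instructor-large."""
    environ = os.environ if environ is None else environ
    root = environ.get("SENTENCE_TRANSFORMERS_HOME") or os.path.join(
        torch_home(environ), "sentence_transformers"
    )
    # Assumed naming: '/' in the model id is replaced by '_'.
    return os.path.join(os.path.expanduser(root), model_name.replace("/", "_"))


if __name__ == "__main__":
    path = sentence_transformers_dir("hkunlp/instructor-large")
    print(path, "exists:", os.path.isdir(path))
```

Running this inside the container, as the same user that runs generate.py, shows whether the mounted cache lands where the library will look.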
from h2ogpt.
Yes! It is already in the right place.
~/.cache/torch/sentence_transformers$ ls -alrt
total 12
drwxrwxr-x 4 sourav sourav 4096 May 6 11:39 ..
drwxr-xr-x 4 sourav sourav 4096 May 6 11:41 hkunlp_instructor-large
drwxr-xr-x 3 sourav sourav 4096 May 6 11:41 .
from h2ogpt.
@pseudotensor please guide me on implementing this.
from h2ogpt.
Hi @glenbhermon I'm really not sure what is wrong if you have the files in the expected place.
UKPLab/sentence-transformers#1725
UKPLab/sentence-transformers#2345
It seems there should be no particular issue.
from h2ogpt.
You have mistakes in your run line, like the missing -- before openai_port and the missing ` around the id for user group. Also, that model doesn't use safetensors, so you need to remove that line.
This works for me:
# 1) ensure $HOME/users and $HOME/db_nonusers are writeable by user running docker
export TRANSFORMERS_OFFLINE=1
export GRADIO_SERVER_PORT=7860
export OPENAI_SERVER_PORT=5000
export HF_HUB_OFFLINE=1
docker run --gpus all \
--runtime=nvidia \
--shm-size=2g \
-e TRANSFORMERS_OFFLINE=$TRANSFORMERS_OFFLINE \
-e HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN \
-e HF_HUB_OFFLINE=$HF_HUB_OFFLINE \
-e HF_HOME="/workspace/.cache/huggingface/" \
-p $GRADIO_SERVER_PORT:$GRADIO_SERVER_PORT \
-p $OPENAI_SERVER_PORT:$OPENAI_SERVER_PORT \
--rm --init \
--network host \
-v /etc/passwd:/etc/passwd:ro \
-v /etc/group:/etc/group:ro \
-u `id -u`:`id -g` \
-v "${HOME}"/.cache/huggingface/:/workspace/.cache/huggingface \
-v "${HOME}"/.cache/torch/:/workspace/.cache/torch \
-v "${HOME}"/.cache/transformers/:/workspace/.cache/transformers \
-v "${HOME}"/save:/workspace/save \
-v "${HOME}"/user_path:/workspace/user_path \
-v "${HOME}"/db_dir_UserData:/workspace/db_dir_UserData \
-v "${HOME}"/users:/workspace/users \
-v "${HOME}"/db_nonusers:/workspace/db_nonusers \
-v "${HOME}"/llamacpp_path:/workspace/llamacpp_path \
-e GRADIO_SERVER_PORT=$GRADIO_SERVER_PORT \
gcr.io/vorvan/h2oai/h2ogpt-runtime:0.2.0 \
/workspace/generate.py \
--base_model=mistralai/Mistral-7B-Instruct-v0.2 \
--use_safetensors=False \
--prompt_type=mistral \
--save_dir='/workspace/save/' \
--use_gpu_id=False \
--user_path=/workspace/user_path \
--langchain_mode="LLM" \
--langchain_modes="['UserData', 'MyData', 'LLM']" \
--score_model=None \
--max_max_new_tokens=2048 \
--max_new_tokens=1024 \
--visible_visible_models=False \
--openai_port=$OPENAI_SERVER_PORT
Depending on whether you use links, you may need more specific mappings that point to the direct location rather than a linked location that cannot be used:
-v "${HOME}"/.cache/huggingface/hub:/workspace/.cache/huggingface/hub \
-v "${HOME}"/.cache:/workspace/.cache \
-e TRANSFORMERS_CACHE="/workspace/.cache/" \
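The missing `--` before openai_port mentioned earlier in this comment is easy to overlook in a long run line. A small, hypothetical check like the one below (not part of h2oGPT) flags `key=value` tokens that lack a leading dash before they are passed to generate.py:

```python
def find_unprefixed_options(args):
    """Return tokens that look like key=value options but lack a leading '-'."""
    return [a for a in args if "=" in a and not a.startswith("-")]


# Tail of the earlier failing run line, with the port value filled in.
args = [
    "/workspace/generate.py",
    "--max_new_tokens=1024",
    "--visible_visible_models=False",
    "openai_port=5000",
]
print(find_unprefixed_options(args))  # ['openai_port=5000']
```

An empty list means every `key=value` token carries its prefix; it does not check that the option names themselves are valid.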
from h2ogpt.