Comments (3)
There is no --use_chat_template parameter. If the model has a chat template and you don't pass --prompt_type, the chat template is used automatically by default, so I don't recommend passing --prompt_type.
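The precedence described above can be sketched roughly like this (a simplified illustration, not h2ogpt's actual code; the function name `resolve_prompting` and its return values are invented for this sketch):

```python
def resolve_prompting(prompt_type, chat_template):
    """Illustrative precedence: an explicit --prompt_type wins;
    otherwise a model-supplied chat template is used automatically;
    otherwise fall back to plain prompting."""
    if prompt_type:      # user passed --prompt_type explicitly
        return f"prompt_type:{prompt_type}"
    if chat_template:    # model ships a chat template -> default behavior
        return "chat_template"
    return "plain"

# No --prompt_type and the model has a template: the template is used by default.
print(resolve_prompting(None, "{% for m in messages %}...{% endfor %}"))  # chat_template
# Passing --prompt_type would override the template, which is why it's not recommended here.
print(resolve_prompting("llama2", "{% for m in messages %}...{% endfor %}"))  # prompt_type:llama2
```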
I fixed the particular issue, thanks.
python generate.py --use_safetensors=True \
--max_seq_len=8192 \
--base_model=lightblue/suzume-llama-3-8B-multilingual \
--use_auth_token=$HUGGING_FACE_HUB_TOKEN \
--add_disk_models_to_ui=False
works now
from h2ogpt.
It already has a chat template, which means it would work OOTB.
Document parsing doesn't work:
Step 1: Create DB
#!/bin/sh
export HUGGING_FACE_HUB_TOKEN=XXX
export CUDA_VISIBLE_DEVICES="0,1"
docker run \
--gpus all \
--runtime=nvidia \
--shm-size=64g \
-p 7860:7860 \
--rm --init \
-v /etc/passwd:/etc/passwd:ro \
-v /etc/group:/etc/group:ro \
-u 1000:1000 \
-v /opt/h2ogpt_data/.cache:/workspace/.cache \
-v /opt/h2ogpt_data/save:/workspace/save \
-v /opt/h2ogpt_data/db_dir_MyData:/workspace/db_dir_MyData \
-v /opt/h2ogpt_data/db_dir_UserData:/workspace/db_dir_UserData \
-v /opt/h2ogpt_data/tmp:/tmp \
-v /opt/h2ogpt_data/DATA:/workspace/user_path/DATA \
-e HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN \
-e TOKENIZERS_PARALLELISM=false \
-e ADMIN_PASS=XXX \
-e CONCURRENCY_COUNT=1 \
gcr.io/vorvan/h2oai/h2ogpt-runtime:latest /workspace/src/make_db.py --db_type=chroma
# EOF
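Conceptually, the make_db.py step walks the mounted user_path, splits each document into overlapping chunks, embeds them, and writes a Chroma collection. A minimal sketch of just the chunking part (a generic RAG-ingestion pattern, not h2ogpt's actual implementation; the chunk sizes are arbitrary):

```python
def chunk_text(text, chunk_size=512, overlap=64):
    """Split text into overlapping character chunks, the usual shape of
    RAG ingestion before embedding and storing in a vector DB."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars of context
    return chunks

doc = "x" * 1000
pieces = chunk_text(doc, chunk_size=512, overlap=64)
print(len(pieces))  # 3
```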
Step 2: Run
#!/bin/sh
export HUGGING_FACE_HUB_TOKEN=XXX
export CUDA_VISIBLE_DEVICES="0,1"
docker run \
--gpus all \
--runtime=nvidia \
--shm-size=64g \
-p 7860:7860 \
--rm --init \
-v /etc/passwd:/etc/passwd:ro \
-v /etc/group:/etc/group:ro \
-u 1000:1000 \
-v /opt/h2ogpt_data/.cache:/workspace/.cache \
-v /opt/h2ogpt_data/save:/workspace/save \
-v /opt/h2ogpt_data/db_dir_UserData:/workspace/db_dir_UserData \
-v /opt/h2ogpt_data/tmp:/tmp \
-v /opt/h2ogpt_data/DATA:/workspace/user_path/DATA \
-e HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN \
-e TOKENIZERS_PARALLELISM=false \
-e ADMIN_PASS=XXX \
-e CONCURRENCY_COUNT=1 \
-e API_OPEN=1 \
-e ALLOW_API=1 \
gcr.io/vorvan/h2oai/h2ogpt-runtime:latest /workspace/generate.py \
--use_safetensors=True \
--prompt_type=unknown \
--max_seq_len=8192 \
--use_chat_template=True \
--base_model=lightblue/suzume-llama-3-8B-multilingual \
--save_dir='/workspace/save/' \
--use_auth_token=$HUGGING_FACE_HUB_TOKEN \
--use_gpu_id=False \
--allow_upload_to_user_data=False \
--allow_upload_to_my_data=True \
--enable_ocr='off' \
--enable_pdf_ocr='off' \
--langchain_mode='UserData' \
--langchain_modes="['LLM', 'UserData', 'MyData']" \
--user_path=/workspace/user_path \
--db_type=chroma \
--visible_h2ogpt_header=False \
--visible_doc_selection_tab=False \
--visible_doc_view_tab=False \
--visible_chat_history_tab=False \
--visible_expert_tab=False \
--visible_models_tab=False \
--visible_system_tab=False \
--visible_tos_tab=False \
--visible_hosts_tab=False
# EOF
Error:
Traceback (most recent call last):
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/queueing.py", line 566, in process_events
response = await route_utils.call_process_api(
... ... ...
File "/workspace/src/gpt_langchain.py", line 7985, in get_chain
assert hasattr(llm, 'chat_conversation')
AssertionError
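The failing line is a guard, assert hasattr(llm, 'chat_conversation'): the LLM wrapper that reaches get_chain lacks an attribute the document-QA path expects, which is consistent with the --prompt_type=unknown / --use_chat_template=True combination producing a different wrapper object than the default path. A minimal illustration of that failure mode (generic Python, not h2ogpt's classes; both class names are invented):

```python
class ChatAwareLLM:
    """Wrapper shaped like what the chain expects."""
    chat_conversation = []

class PlainLLM:
    """Wrapper missing the expected attribute."""

def get_chain(llm):
    # Same style of guard as the failing line in gpt_langchain.py.
    assert hasattr(llm, 'chat_conversation')
    return "chain built"

print(get_chain(ChatAwareLLM()))   # chain built
try:
    get_chain(PlainLLM())
except AssertionError:
    print("AssertionError: the wrong kind of LLM wrapper reached the chain")
```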
Related Issues (20)
- question regarding model_lock HOT 2
- Executing small model but missing config.json error with microsoft/Phi-3-mini-4k-instruct-gguf HOT 1
- Q and A not working for Youtube HOT 7
- sentence transformer version HOT 2
- Logging/Saving Settings and Instructions for Inference Jobs HOT 6
- Youtube ingestion doesn't work HOT 3
- Youtube chat does not work HOT 1
- Timestamps issue in Youtube Chat
- Document Content Presentation Difference Between Built-In UI and Custom UI using Gradio client HOT 1
- Change AutoGPT Agent Embeddings Model HOT 5
- Question:extracting preference data of clients' response HOT 3
- Does h2o have assistant API HOT 1
- Consider switching to Coqui TTS from new repo
- Can I use existing llama.cpp server as inference server?
- Is there a plan to incorporate a Knowledge Graph RAG Query Engine?
- Option to place relevant document chunks in system prompt instead of user prompt
- AutoGPT issue running on Local LLM HOT 2
- Running H2ogpt with Ollama inference Server HOT 1
- Unable to Programmatically Receive Sources with Prompts & Responses
- h2o Windows installer "Web Search" and "Q/A"