Comments (5)
Maybe it would be better to use this volume read-only. Then I would just need to make the models available in the bucket before starting the process?
Could you please guide me through the procedure to provision the S3 bucket?
Thanks :)
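For reference, a minimal sketch of provisioning such a bucket with the AWS CLI (bucket name and region here are made-up placeholders, not from this thread):

```shell
# Create the bucket (name and region are hypothetical; adjust for your account)
aws s3 mb s3://my-tgi-models --region eu-west-3

# Block all public access, since the bucket only holds model weights for the pod
aws s3api put-public-access-block \
  --bucket my-tgi-models \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
```

The pod then still needs read access to the bucket (e.g. s3:GetObject and s3:ListBucket) via its service account or node role.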
from text-generation-inference.
Ok I managed to do what I wanted:
- clone the model
git clone https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
- sync the model to S3
aws s3 sync Mistral-7B-Instruct-v0.2 s3://<bucket_name>/Mistral-7B-Instruct-v0.2
- use it in the pod
text-generation-launcher --model-id=/data/Mistral-7B-Instruct-v0.2 --quantize bitsandbytes-nf4
2024-04-26T12:57:37.746280Z INFO text_generation_launcher: Args { model_id: "/data/Mistral-7B-Instruct-v0.2", revision: None, validation_workers: 2, sharded: None, num_shard: None, quantize: Some(BitsandbytesNF4), speculate: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, max_batch_size: None, enable_cuda_graphs: false, hostname: "text-generation-inference-58d9869995-gxzx2", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, tokenizer_config_path: None, disable_grammar_support: false, env: false }
2024-04-26T12:57:37.746720Z INFO download: text_generation_launcher: Starting download process.
2024-04-26T12:57:48.114689Z INFO text_generation_launcher: Files are already present on the host. Skipping download.
2024-04-26T12:57:50.144159Z INFO download: text_generation_launcher: Successfully downloaded weights.
2024-04-26T12:57:50.144763Z INFO shard-manager: text_generation_launcher: Starting shard rank=0
2024-04-26T12:58:00.242683Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2024-04-26T12:58:02.873865Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:
rank=0
2024-04-26T12:58:02.873894Z ERROR shard-manager: text_generation_launcher: Shard process was signaled to shutdown with signal 9 rank=0
2024-04-26T12:58:02.944252Z ERROR text_generation_launcher: Shard 0 failed to start
2024-04-26T12:58:02.944282Z INFO text_generation_launcher: Shutting down shards
Error: ShardCannotStart
I have another error that might not be related. I'm going to resolve that before closing this issue.
Ok my first issue was caused by insufficient memory allocation.
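As a rough sanity check (my own back-of-envelope estimate, not from the TGI docs): NF4 quantization stores about half a byte per parameter, so the weights alone of a 7B model need roughly 3.5 GiB, before KV cache, activations, and CUDA overhead:

```shell
# Back-of-envelope weight memory for a 7B model quantized to NF4 (~0.5 bytes/param).
# Excludes KV cache, activations, and CUDA context, which add several GiB more.
PARAMS=7000000000
BYTES=$(( PARAMS / 2 ))
echo "approx weight memory: $(( BYTES / 1024 / 1024 / 1024 )) GiB"
```

So a pod memory limit sized only for the raw weights will still get OOM-killed (signal 9) once the server allocates its runtime buffers.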
Now I'm getting this error:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge
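For anyone hitting the same thing: HeaderTooLarge from safetensors is a typical symptom of the earlier git clone having fetched Git LFS pointer files instead of the real weights (safetensors then tries to read the ASCII pointer as a binary header length). A quick way to check, sketched here against a fake pointer file:

```shell
# Simulate what a clone without git-lfs leaves behind: a small ASCII pointer file
f=$(mktemp)
printf 'version https://git-lfs.github.com/spec/v1\noid sha256:deadbeef\nsize 123\n' > "$f"

# A real safetensors file starts with a binary 8-byte header length, not "version"
if head -c 7 "$f" | grep -q '^version'; then
  echo "LFS pointer file, not real weights"
fi
```

If your .safetensors files start with "version", re-fetch them with git-lfs installed, or with huggingface-cli as below.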
Well I managed to download the model using the recommended way with huggingface-cli
huggingface-cli download mistralai/Mistral-7B-Instruct-v0.2
aws s3 sync /home/smana/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2 s3://<bucket>/models--mistralai--Mistral-7B-Instruct-v0.2
When the pod starts I still get permission errors :/
text-generation-launcher --model-id=mistralai/Mistral-7B-Instruct-v0.2 --quantize bitsandbytes-nf4
...
2024-04-26T15:37:48.725974Z INFO text_generation_launcher: Files are already present on the host. Skipping download.
...
PermissionError: [Errno 1] Operation not permitted: '/data/models--mistralai--Mistral-7B-Instruct-v0.2/tmp_7e2fd113-2af9-4a1a-bf0e-22d328d4bc8b'
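My guess (an assumption based on Mountpoint for S3's documented POSIX limitations, not something verified in this thread): TGI downloads weights into a tmp_<uuid> file and then renames it into place, and the S3 mountpoint rejects that rename, which surfaces as the PermissionError above. On a normal filesystem the pattern is just:

```shell
# The write-then-rename pattern TGI appears to use for atomic downloads.
# On Mountpoint for S3 the rename step is not supported, hence the error;
# on EFS or any POSIX filesystem it works fine.
dir=$(mktemp -d)
tmp="$dir/tmp_example"               # stand-in for TGI's tmp_<uuid> file
echo "weights" > "$tmp"
mv "$tmp" "$dir/model.safetensors"   # the rename that fails on an S3 mountpoint
cat "$dir/model.safetensors"
```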
It is working much better with EFS storage, but I'll leave this issue open in case someone finds a solution for the S3 mountpoint.