
Comments (5)

Smana commented on May 23, 2024

Maybe it would be better to use this volume in read-only mode, so I would just need to make the models available in the bucket before starting the process?
Could you please guide me through the procedure for provisioning the S3 bucket?

Thanks :)
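For what it's worth, provisioning the bucket itself can be done with a couple of AWS CLI calls. A minimal sketch (bucket name and region are placeholders, and a real setup would also need an IAM policy granting the pod access):

```shell
# Create the bucket (regions other than us-east-1 need --region)
aws s3 mb s3://my-tgi-models --region eu-west-3

# Block public access, a sensible default for model storage
aws s3api put-public-access-block \
  --bucket my-tgi-models \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
```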

from text-generation-inference.

Smana commented on May 23, 2024

OK, I managed to do what I wanted:

  • clone the model
git clone https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
  • sync the model to S3
aws s3 sync Mistral-7B-Instruct-v0.2 s3://<bucket_name>/Mistral-7B-Instruct-v0.2
  • use it in the pod
text-generation-launcher --model-id=/data/Mistral-7B-Instruct-v0.2 --quantize bitsandbytes-nf4
2024-04-26T12:57:37.746280Z  INFO text_generation_launcher: Args { model_id: "/data/Mistral-7B-Instruct-v0.2", revision: None, validation_workers: 2, sharded: None, num_shard: None, quantize: Some(BitsandbytesNF4), speculate: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, max_batch_size: None, enable_cuda_graphs: false, hostname: "text-generation-inference-58d9869995-gxzx2", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, tokenizer_config_path: None, disable_grammar_support: false, env: false }
2024-04-26T12:57:37.746720Z  INFO download: text_generation_launcher: Starting download process.
2024-04-26T12:57:48.114689Z  INFO text_generation_launcher: Files are already present on the host. Skipping download.

2024-04-26T12:57:50.144159Z  INFO download: text_generation_launcher: Successfully downloaded weights.
2024-04-26T12:57:50.144763Z  INFO shard-manager: text_generation_launcher: Starting shard rank=0
2024-04-26T12:58:00.242683Z  INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2024-04-26T12:58:02.873865Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:
 rank=0
2024-04-26T12:58:02.873894Z ERROR shard-manager: text_generation_launcher: Shard process was signaled to shutdown with signal 9 rank=0
2024-04-26T12:58:02.944252Z ERROR text_generation_launcher: Shard 0 failed to start
2024-04-26T12:58:02.944282Z  INFO text_generation_launcher: Shutting down shards
Error: ShardCannotStart
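The "signaled to shutdown with signal 9" line above is SIGKILL, which on Kubernetes usually means the container hit its memory limit and was OOM-killed (checking `kubectl describe pod` for `Reason: OOMKilled` under "Last State" confirms this). A quick local illustration of the corresponding exit code:

```shell
# A SIGKILL-ed process reports exit status 137 (128 + 9),
# the same signature Kubernetes shows for OOMKilled containers.
sleep 30 &
pid=$!
kill -9 "$pid"
wait "$pid"
status=$?
echo "exit status: $status"
```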

I have another error that might not be related. I'll try to solve it before closing this issue.


Smana commented on May 23, 2024

OK, my first issue was caused by insufficient memory allocation.
Now I get this error:

safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge 
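One common cause of `HeaderTooLarge` (an assumption here, but consistent with the plain `git clone` in the earlier comment) is that git-lfs was not installed, so the cloned `.safetensors` files are actually small text pointer files. A safetensors file begins with an 8-byte little-endian header length, so decoding the first bytes of a pointer file yields a nonsensically large value:

```shell
# Simulate the pointer file git leaves behind when git-lfs is missing
printf 'version https://git-lfs.github.com/spec/v1\n' > pointer.txt

# Interpret the first 8 bytes as an unsigned integer, the way the
# safetensors loader reads the header length
header_len=$(head -c 8 pointer.txt | od -An -t u8 | tr -d ' ')
echo "claimed header length: $header_len"
```

A quick `head -c 100` on the files in the bucket shows whether they are real weights or LFS pointers; if the latter, re-clone with git-lfs installed or download via `huggingface-cli`.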


Smana commented on May 23, 2024

Well, I managed to download the model the recommended way, with huggingface-cli:

huggingface-cli download mistralai/Mistral-7B-Instruct-v0.2
aws s3 sync /home/smana/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2 s3://<bucket>/models--mistralai--Mistral-7B-Instruct-v0.2

When the pod starts, I still get permission errors :/

text-generation-launcher --model-id=mistralai/Mistral-7B-Instruct-v0.2 --quantize bitsandbytes-nf4
...
2024-04-26T15:37:48.725974Z  INFO text_generation_launcher: Files are already present on the host. Skipping download.
...
PermissionError: [Errno 1] Operation not permitted: '/data/models--mistralai--Mistral-7B-Instruct-v0.2/tmp_7e2fd113-2af9-4a1a-bf0e-22d328d4bc8b'
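Mountpoint for S3 supports only a subset of POSIX file operations (no chmod, no modifying or renaming files in place), which is a plausible reason the launcher's temp-file shuffle in `/data` fails with `EPERM`. One hedged workaround, using flag names visible in the `Args` dump above (their exact semantics may differ), is to serve the weights read-only from the S3 mount while pointing the writable cache at local storage:

```shell
# Sketch (untested): weights come from the read-only S3 mount,
# while the Hugging Face cache goes to a writable local path.
text-generation-launcher \
  --model-id=/data/models--mistralai--Mistral-7B-Instruct-v0.2 \
  --huggingface-hub-cache=/tmp/hf-cache \
  --quantize bitsandbytes-nf4
```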


SmaineTF1 commented on May 23, 2024

It works much better with EFS storage, but I'll leave this issue open in case someone finds a solution for the S3 mountpoint.

