
Comments (5)

WhiteZz1 commented on May 11, 2024

text-generation-launcher --model-id /root/autodl-tmp/bloom --num-shard 1

This is the startup command I entered; --model-id specifies my local model path.

OlivierDehaene commented on May 11, 2024

--model-id is the name of the model on the Hugging Face Hub. The weights will be downloaded automatically to your HUGGINGFACE_HUB_CACHE. If you want to specify where the weights are stored on your disk, you can pass --weights-cache-override.
In your case: text-generation-launcher --model-id bigscience/bloom --weights-cache-override /root/autodl-tmp/bloom
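
For anyone who wants to pre-populate that cache instead of letting the launcher download at startup, here is a minimal sketch using huggingface_hub's snapshot_download (the cache_dir path is just the one used in this thread, not a requirement):

    # Minimal sketch: pre-download the weights into the directory that
    # --weights-cache-override will point at, so the launcher can skip the download.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id="bigscience/bloom",
        cache_dir="/root/autodl-tmp/bloom",  # creates models--bigscience--bloom/ inside
    )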

WhiteZz1 commented on May 11, 2024

I downloaded the complete model from Hugging Face. Why would it be downloaded again after I enter this command?

(my-env) root@autodl-container-b369119e00-448c7f66:~/autodl-tmp/text-generation-inference# text-generation-launcher --model-id bigscience/bloom --weights-cache-override /root/autodl-tmp/bloom
2023-02-21T06:28:55.309769Z  INFO text_generation_launcher: Args { model_id: "bigscience/bloom", revision: None, num_shard: 1, quantize: false, max_concurrent_requests: 128, max_input_length: 1000, max_batch_size: 32, max_waiting_tokens: 20, port: 3000, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: None, weights_cache_override: Some("/root/autodl-tmp/bloom"), disable_custom_kernels: false, json_output: false, otlp_endpoint: None, cors_allow_origin: [] }
2023-02-21T06:28:55.309876Z  INFO text_generation_launcher: weights_cache_override is set to Some("/root/autodl-tmp/bloom").
2023-02-21T06:28:55.309890Z  INFO text_generation_launcher: Skipping download.
2023-02-21T06:28:55.310058Z  INFO text_generation_launcher: Starting shard 0
2023-02-21T06:29:05.320211Z  INFO text_generation_launcher: Waiting for shard 0 to be ready...
2023-02-21T06:29:12.805928Z ERROR shard-manager: text_generation_launcher: "Error when initializing model
Traceback (most recent call last):
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/connectionpool.py\", line 386, in _make_request
    self._validate_conn(conn)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/connectionpool.py\", line 1042, in _validate_conn
    conn.connect()
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/connection.py\", line 414, in connect
    self.sock = ssl_wrap_socket(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/util/ssl_.py\", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/util/ssl_.py\", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/ssl.py\", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/ssl.py\", line 1040, in _create
    self.do_handshake()
  File \"/root/miniconda3/envs/my-env/lib/python3.9/ssl.py\", line 1309, in do_handshake
    self._sslobj.do_handshake()
socket.timeout: _ssl.c:1105: The handshake operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/requests/adapters.py\", line 489, in send
    resp = conn.urlopen(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/connectionpool.py\", line 787, in urlopen
    retries = retries.increment(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/util/retry.py\", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/packages/six.py\", line 770, in reraise
    raise value
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/connectionpool.py\", line 703, in urlopen
    httplib_response = self._make_request(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/connectionpool.py\", line 389, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=conn.timeout)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/connectionpool.py\", line 340, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='cdn-lfs.huggingface.co', port=443): Read timed out. (read timeout=10.0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File \"/root/miniconda3/envs/my-env/bin/text-generation-server\", line 8, in <module>
    sys.exit(app())
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/typer/main.py\", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/click/core.py\", line 1130, in __call__
    return self.main(*args, **kwargs)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/typer/core.py\", line 778, in main
    return _main(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/typer/core.py\", line 216, in _main
    rv = self.invoke(ctx)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/click/core.py\", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/click/core.py\", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/click/core.py\", line 760, in invoke
    return __callback(*args, **kwargs)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/typer/main.py\", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File \"/root/autodl-tmp/text-generation-inference/server/text_generation/cli.py\", line 55, in serve
    server.serve(model_id, revision, sharded, quantize, uds_path)
  File \"/root/autodl-tmp/text-generation-inference/server/text_generation/server.py\", line 130, in serve
    asyncio.run(serve_inner(model_id, revision, sharded, quantize))
  File \"/root/miniconda3/envs/my-env/lib/python3.9/asyncio/runners.py\", line 44, in run
    return loop.run_until_complete(main)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/asyncio/base_events.py\", line 629, in run_until_complete
    self.run_forever()
  File \"/root/miniconda3/envs/my-env/lib/python3.9/asyncio/base_events.py\", line 596, in run_forever
    self._run_once()
  File \"/root/miniconda3/envs/my-env/lib/python3.9/asyncio/base_events.py\", line 1890, in _run_once
    handle._run()
  File \"/root/miniconda3/envs/my-env/lib/python3.9/asyncio/events.py\", line 80, in _run
    self._context.run(self._callback, *self._args)
> File \"/root/autodl-tmp/text-generation-inference/server/text_generation/server.py\", line 99, in serve_inner
    model = get_model(model_id, revision, sharded, quantize)
  File \"/root/autodl-tmp/text-generation-inference/server/text_generation/models/__init__.py\", line 59, in get_model
    return BLOOM(model_id, revision, quantize=quantize)
  File \"/root/autodl-tmp/text-generation-inference/server/text_generation/models/causal_lm.py\", line 253, in __init__
    self.model = AutoModelForCausalLM.from_pretrained(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/transformers-4.27.0.dev0-py3.9.egg/transformers/models/auto/auto_factory.py\", line 464, in from_pretrained
    return model_class.from_pretrained(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/transformers-4.27.0.dev0-py3.9.egg/transformers/modeling_utils.py\", line 2333, in from_pretrained
    resolved_archive_file, sharded_metadata = get_checkpoint_shard_files(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/transformers-4.27.0.dev0-py3.9.egg/transformers/utils/hub.py\", line 920, in get_checkpoint_shard_files
    cached_filename = cached_file(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/transformers-4.27.0.dev0-py3.9.egg/transformers/utils/hub.py\", line 410, in cached_file
    resolved_file = hf_hub_download(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py\", line 124, in _inner_fn
    return fn(*args, **kwargs)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/huggingface_hub/file_download.py\", line 1283, in hf_hub_download
    http_get(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/huggingface_hub/file_download.py\", line 503, in http_get
    r = _request_wrapper(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/huggingface_hub/file_download.py\", line 440, in _request_wrapper
    return http_backoff(
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/huggingface_hub/utils/_http.py\", line 129, in http_backoff
    response = requests.request(method=method, url=url, **kwargs)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/requests/api.py\", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/requests/sessions.py\", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/requests/sessions.py\", line 701, in send
    r = adapter.send(request, **kwargs)
  File \"/root/miniconda3/envs/my-env/lib/python3.9/site-packages/requests/adapters.py\", line 578, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='cdn-lfs.huggingface.co', port=443): Read timed out. (read timeout=10.0)
" rank=0
2023-02-21T06:29:13.524601Z ERROR text_generation_launcher: Shard 0 failed to start:
We're not using custom kernels.
[The launcher then re-prints essentially the same ReadTimeout traceback as the shard-manager error above, ending in:]

requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='cdn-lfs.huggingface.co', port=443): Read timed out. (read timeout=10.0)

2023-02-21T06:29:13.524841Z  INFO text_generation_launcher: Shutting down shards

WhiteZz1 commented on May 11, 2024

I downloaded bigscience/bloom from Hugging Face and placed the model's .bin files in the /root/autodl-tmp/bloom folder, then executed text-generation-launcher --model-id bigscience/bloom --weights-cache-override /root/autodl-tmp/bloom. The console kept printing "2023-02-21T07:04:55.545088Z INFO text_generation_launcher: Waiting for shard 0 to be ready...". While it was printing, I checked the /root/autodl-tmp/bloom folder and found a new folder "models--bigscience--bloom/" inside it. This should be the Hugging Face cache folder, but I'm not sure.

My understanding is that some parameter should let me use the model from the local path where I already downloaded it, instead of the model cache path, but I don't know which parameter that is. I have already downloaded this bloom model 3 or 4 times, and because of network problems the download is very slow. If you can help me, I would be very grateful @OlivierDehaene
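
For context (an editorial note, not part of the original thread): models--bigscience--bloom is the standard huggingface_hub cache layout, <cache_dir>/models--<org>--<name>/snapshots/<commit>/..., and files placed directly at the folder root are invisible to it, which is why the download restarted. A minimal sketch to check what a given cache root already resolves, assuming the paths from this thread:

    # Minimal sketch: ask huggingface_hub whether a file already resolves from a
    # local cache root, without touching the network.
    from huggingface_hub import try_to_load_from_cache

    path = try_to_load_from_cache(
        repo_id="bigscience/bloom",
        filename="config.json",
        cache_dir="/root/autodl-tmp/bloom",  # cache root used in this thread
    )
    print(path)  # a local file path if cached; None if it would need downloading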

WhiteZz1 commented on May 11, 2024

I moved the locally downloaded files to the directory specified by HF_HOME and then executed text-generation-launcher --model-id bigscience/bloom --weights-cache-override /root/autodl-tmp/bloom. Now it runs correctly! @OlivierDehaene thank you
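
A likely explanation (an assumption based on huggingface_hub's documented behavior, not stated in the thread): the hub cache defaults to ~/.cache/huggingface/hub and follows HF_HOME or HUGGINGFACE_HUB_CACHE when set, so moving the files into that layout let the server find them locally. A quick way to check where your installation resolves the cache:

    # Minimal sketch: print the cache directory huggingface_hub will actually use,
    # after HF_HOME / HUGGINGFACE_HUB_CACHE are taken into account.
    import os
    from huggingface_hub import constants

    print(os.environ.get("HF_HOME"))        # None unless explicitly set
    print(constants.HUGGINGFACE_HUB_CACHE)  # resolved hub cache directory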
