Comments (1)

bananasmoothii commented on April 18, 2024

I'm not sure if this is the same issue, but I tried to use a 2-GPU model and got this:

terminate called after throwing an instance of 'std::runtime_error'
  what():  [FT][ERROR] shared_ft_model->getTensorParaSize() * shared_ft_model->getPipelineParaSize() == world_size Assertion fail: /workspace/build/fastertransformer_backend/src/libfastertransformer.cc:498
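
For anyone reading the assertion, here is a minimal sketch of the condition it states. The values for `tensor_para_size` ("2") and `pipeline_para_size` ("1") are copied from the model configuration in the logs below. The helper name `check_parallel_layout` is hypothetical, and treating `world_size` as the number of GPUs visible to the Triton container is an assumption on my part, not something the log confirms (the log only reports a CUDA memory pool being created on device 0).

```python
def check_parallel_layout(tensor_para_size: int,
                          pipeline_para_size: int,
                          world_size: int) -> bool:
    """Hypothetical helper: mirrors the condition quoted in the error message,
    tensor_para_size * pipeline_para_size == world_size."""
    return tensor_para_size * pipeline_para_size == world_size


# Values taken from the "parameters" block of the logged model configuration.
params = {"tensor_para_size": "2", "pipeline_para_size": "1"}
tp = int(params["tensor_para_size"])
pp = int(params["pipeline_para_size"])

# ASSUMPTION: world_size corresponds to the number of GPUs the container can see.
for visible_gpus in (1, 2):
    ok = check_parallel_layout(tp, pp, visible_gpus)
    status = "condition holds" if ok else "assertion would fail"
    print(f"visible GPUs = {visible_gpus}: {status}")
```

If that assumption is right, a model converted for 2 GPUs would only pass the check when the container actually sees two devices, so the GPU settings of the container may be worth checking first.
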
Complete logs
[+] Building 2.2s (17/17) FINISHED                                                                                                                                                                                    docker:default
 => [copilot_proxy internal] load .dockerignore                                                                                                                                                                                 0.0s
 => => transferring context: 2B                                                                                                                                                                                                 0.0s
 => [copilot_proxy internal] load build definition from proxy.Dockerfile                                                                                                                                                        0.1s
 => => transferring dockerfile: 307B                                                                                                                                                                                            0.0s
 => [triton internal] load .dockerignore                                                                                                                                                                                        0.1s
 => => transferring context: 2B                                                                                                                                                                                                 0.0s
 => [triton internal] load build definition from triton.Dockerfile                                                                                                                                                              0.1s
 => => transferring dockerfile: 325B                                                                                                                                                                                            0.0s
 => [copilot_proxy internal] load metadata for docker.io/library/python:3.10-slim-buster                                                                                                                                        1.7s
 => [triton internal] load metadata for docker.io/moyix/triton_with_ft:22.09                                                                                                                                                    0.8s
 => [triton 1/3] FROM docker.io/moyix/triton_with_ft:22.09@sha256:5a15c1f29c6b018967b49c588eb0ea67acbf897abb7f26e509ec21844574c9b1                                                                                              0.0s
 => CACHED [triton 2/3] RUN python3 -m pip install --disable-pip-version-check -U torch --extra-index-url https://download.pytorch.org/whl/cu116                                                                                0.0s
 => CACHED [triton 3/3] RUN python3 -m pip install --disable-pip-version-check -U transformers bitsandbytes accelerate                                                                                                          0.0s
 => [triton] exporting to image                                                                                                                                                                                                 0.0s
 => => exporting layers                                                                                                                                                                                                         0.0s
 => => writing image sha256:79dd3771c789003418dd215e18f816ca7e796d4d77a4de792907f7d8aa8a5bee                                                                                                                                    0.0s
 => => naming to docker.io/library/fauxpilot-triton                                                                                                                                                                             0.0s
 => [copilot_proxy 1/5] FROM docker.io/library/python:3.10-slim-buster@sha256:37aa274c2d001f09b14828450d903c55f821c90f225fdfdd80c5180fcca77b3f                                                                                  0.0s
 => [copilot_proxy internal] load build context                                                                                                                                                                                 0.3s
 => => transferring context: 1.10kB                                                                                                                                                                                             0.3s
 => CACHED [copilot_proxy 2/5] WORKDIR /python-docker                                                                                                                                                                           0.0s
 => CACHED [copilot_proxy 3/5] COPY copilot_proxy/requirements.txt requirements.txt                                                                                                                                             0.0s
 => CACHED [copilot_proxy 4/5] RUN pip3 install --no-cache-dir -r requirements.txt                                                                                                                                              0.0s
 => CACHED [copilot_proxy 5/5] COPY copilot_proxy .                                                                                                                                                                             0.0s
 => [copilot_proxy] exporting to image                                                                                                                                                                                          0.0s
 => => exporting layers                                                                                                                                                                                                         0.0s
 => => writing image sha256:6aaa5d89d067dcc60e23eed04bb393abeb1d1e62ff46fd6031ee15d63a480801                                                                                                                                    0.0s
 => => naming to docker.io/library/fauxpilot-copilot_proxy                                                                                                                                                                      0.0s
[+] Running 2/0
 ✔ Container fauxpilot-copilot_proxy-1  Created                                                                                                                                                                                 0.0s
 ✔ Container fauxpilot-triton-1         Created                                                                                                                                                                                 0.0s
Attaching to fauxpilot-copilot_proxy-1, fauxpilot-triton-1
fauxpilot-triton-1         |
fauxpilot-triton-1         | =============================
fauxpilot-triton-1         | == Triton Inference Server ==
fauxpilot-triton-1         | =============================
fauxpilot-triton-1         |
fauxpilot-triton-1         | NVIDIA Release 22.06 (build 39726160)
fauxpilot-triton-1         | Triton Server Version 2.23.0
fauxpilot-triton-1         |
fauxpilot-triton-1         | Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
fauxpilot-triton-1         |
fauxpilot-triton-1         | Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
fauxpilot-triton-1         |
fauxpilot-triton-1         | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
fauxpilot-triton-1         | By pulling and using the container, you accept the terms and conditions of this license:
fauxpilot-triton-1         | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
fauxpilot-copilot_proxy-1  | INFO:     Started server process [1]
fauxpilot-copilot_proxy-1  | INFO:     Waiting for application startup.
fauxpilot-copilot_proxy-1  | INFO:     Application startup complete.
fauxpilot-copilot_proxy-1  | INFO:     Uvicorn running on http://0.0.0.0:5000 (Press CTRL+C to quit)
fauxpilot-triton-1         |
fauxpilot-triton-1         | I1021 20:33:12.659520 88 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x204e00000' with size 268435456
fauxpilot-triton-1         | I1021 20:33:12.659623 88 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
fauxpilot-triton-1         | I1021 20:33:17.888475 88 model_repository_manager.cc:1191] loading: fastertransformer:1
fauxpilot-triton-1         | I1021 20:33:18.058662 88 libfastertransformer.cc:1226] TRITONBACKEND_Initialize: fastertransformer
fauxpilot-triton-1         | I1021 20:33:18.058688 88 libfastertransformer.cc:1236] Triton TRITONBACKEND API version: 1.10
fauxpilot-triton-1         | I1021 20:33:18.058691 88 libfastertransformer.cc:1242] 'fastertransformer' TRITONBACKEND API version: 1.10
fauxpilot-triton-1         | I1021 20:33:18.058712 88 libfastertransformer.cc:1274] TRITONBACKEND_ModelInitialize: fastertransformer (version 1)
fauxpilot-triton-1         | W1021 20:33:18.059506 88 libfastertransformer.cc:149] model configuration:
fauxpilot-triton-1         | {
fauxpilot-triton-1         |     "name": "fastertransformer",
fauxpilot-triton-1         |     "platform": "",
fauxpilot-triton-1         |     "backend": "fastertransformer",
fauxpilot-triton-1         |     "version_policy": {
fauxpilot-triton-1         |         "latest": {
fauxpilot-triton-1         |             "num_versions": 1
fauxpilot-triton-1         |         }
fauxpilot-triton-1         |     },
fauxpilot-triton-1         |     "max_batch_size": 1024,
fauxpilot-triton-1         |     "input": [
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "input_ids",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "start_id",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "end_id",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "input_lengths",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "request_output_len",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "runtime_top_k",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "runtime_top_p",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "beam_search_diversity_rate",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "temperature",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "len_penalty",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "repetition_penalty",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "random_seed",
fauxpilot-triton-1         |             "data_type": "TYPE_INT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "is_return_log_probs",
fauxpilot-triton-1         |             "data_type": "TYPE_BOOL",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "beam_width",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "bad_words_list",
fauxpilot-triton-1         |             "data_type": "TYPE_INT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 2,
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "stop_words_list",
fauxpilot-triton-1         |             "data_type": "TYPE_INT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 2,
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         }
fauxpilot-triton-1         |     ],
fauxpilot-triton-1         |     "output": [
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "output_ids",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1,
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "label_filename": "",
fauxpilot-triton-1         |             "is_shape_tensor": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "sequence_length",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "label_filename": "",
fauxpilot-triton-1         |             "is_shape_tensor": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "cum_log_probs",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "label_filename": "",
fauxpilot-triton-1         |             "is_shape_tensor": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "output_log_probs",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1,
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "label_filename": "",
fauxpilot-triton-1         |             "is_shape_tensor": false
fauxpilot-triton-1         |         }
fauxpilot-triton-1         |     ],
fauxpilot-triton-1         |     "batch_input": [],
fauxpilot-triton-1         |     "batch_output": [],
fauxpilot-triton-1         |     "optimization": {
fauxpilot-triton-1         |         "priority": "PRIORITY_DEFAULT",
fauxpilot-triton-1         |         "input_pinned_memory": {
fauxpilot-triton-1         |             "enable": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "output_pinned_memory": {
fauxpilot-triton-1         |             "enable": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "gather_kernel_buffer_threshold": 0,
fauxpilot-triton-1         |         "eager_batching": false
fauxpilot-triton-1         |     },
fauxpilot-triton-1         |     "instance_group": [
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "fastertransformer_0",
fauxpilot-triton-1         |             "kind": "KIND_CPU",
fauxpilot-triton-1         |             "count": 1,
fauxpilot-triton-1         |             "gpus": [],
fauxpilot-triton-1         |             "secondary_devices": [],
fauxpilot-triton-1         |             "profile": [],
fauxpilot-triton-1         |             "passive": false,
fauxpilot-triton-1         |             "host_policy": ""
fauxpilot-triton-1         |         }
fauxpilot-triton-1         |     ],
fauxpilot-triton-1         |     "default_model_filename": "codegen-6B-mono",
fauxpilot-triton-1         |     "cc_model_filenames": {},
fauxpilot-triton-1         |     "metric_tags": {},
fauxpilot-triton-1         |     "parameters": {
fauxpilot-triton-1         |         "model_name": {
fauxpilot-triton-1         |             "string_value": "codegen-6B-mono"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "is_half": {
fauxpilot-triton-1         |             "string_value": "1"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "enable_custom_all_reduce": {
fauxpilot-triton-1         |             "string_value": "0"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "vocab_size": {
fauxpilot-triton-1         |             "string_value": "51200"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "tensor_para_size": {
fauxpilot-triton-1         |             "string_value": "2"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "decoder_layers": {
fauxpilot-triton-1         |             "string_value": "33"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "size_per_head": {
fauxpilot-triton-1         |             "string_value": "256"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "max_seq_len": {
fauxpilot-triton-1         |             "string_value": "2048"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "end_id": {
fauxpilot-triton-1         |             "string_value": "50256"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "inter_size": {
fauxpilot-triton-1         |             "string_value": "16384"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "head_num": {
fauxpilot-triton-1         |             "string_value": "16"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "model_type": {
fauxpilot-triton-1         |             "string_value": "GPT-J"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "model_checkpoint_path": {
fauxpilot-triton-1         |             "string_value": "/model/fastertransformer/1/2-gpu"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "rotary_embedding": {
fauxpilot-triton-1         |             "string_value": "64"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "pipeline_para_size": {
fauxpilot-triton-1         |             "string_value": "1"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "start_id": {
fauxpilot-triton-1         |             "string_value": "50256"
fauxpilot-triton-1         |         }
fauxpilot-triton-1         |     },
fauxpilot-triton-1         |     "model_warmup": []
fauxpilot-triton-1         | }
fauxpilot-triton-1         | I1021 20:33:18.059575 88 libfastertransformer.cc:1320] TRITONBACKEND_ModelInstanceInitialize: fastertransformer_0 (device 0)
fauxpilot-triton-1         | W1021 20:33:18.059594 88 libfastertransformer.cc:453] Faster transformer model instance is created at GPU '0'
fauxpilot-triton-1         | W1021 20:33:18.059596 88 libfastertransformer.cc:459] Model name codegen-6B-mono
fauxpilot-triton-1         | W1021 20:33:18.059601 88 libfastertransformer.cc:578] Get input name: input_ids, type: TYPE_UINT32, shape: [-1]
fauxpilot-triton-1         | W1021 20:33:18.059603 88 libfastertransformer.cc:578] Get input name: start_id, type: TYPE_UINT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059605 88 libfastertransformer.cc:578] Get input name: end_id, type: TYPE_UINT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059606 88 libfastertransformer.cc:578] Get input name: input_lengths, type: TYPE_UINT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059608 88 libfastertransformer.cc:578] Get input name: request_output_len, type: TYPE_UINT32, shape: [-1]
fauxpilot-triton-1         | W1021 20:33:18.059609 88 libfastertransformer.cc:578] Get input name: runtime_top_k, type: TYPE_UINT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059611 88 libfastertransformer.cc:578] Get input name: runtime_top_p, type: TYPE_FP32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059612 88 libfastertransformer.cc:578] Get input name: beam_search_diversity_rate, type: TYPE_FP32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059614 88 libfastertransformer.cc:578] Get input name: temperature, type: TYPE_FP32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059615 88 libfastertransformer.cc:578] Get input name: len_penalty, type: TYPE_FP32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059616 88 libfastertransformer.cc:578] Get input name: repetition_penalty, type: TYPE_FP32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059618 88 libfastertransformer.cc:578] Get input name: random_seed, type: TYPE_INT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059619 88 libfastertransformer.cc:578] Get input name: is_return_log_probs, type: TYPE_BOOL, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059621 88 libfastertransformer.cc:578] Get input name: beam_width, type: TYPE_UINT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059623 88 libfastertransformer.cc:578] Get input name: bad_words_list, type: TYPE_INT32, shape: [2, -1]
fauxpilot-triton-1         | W1021 20:33:18.059625 88 libfastertransformer.cc:578] Get input name: stop_words_list, type: TYPE_INT32, shape: [2, -1]
fauxpilot-triton-1         | W1021 20:33:18.059628 88 libfastertransformer.cc:620] Get output name: output_ids, type: TYPE_UINT32, shape: [-1, -1]
fauxpilot-triton-1         | W1021 20:33:18.059630 88 libfastertransformer.cc:620] Get output name: sequence_length, type: TYPE_UINT32, shape: [-1]
fauxpilot-triton-1         | W1021 20:33:18.059632 88 libfastertransformer.cc:620] Get output name: cum_log_probs, type: TYPE_FP32, shape: [-1]
fauxpilot-triton-1         | W1021 20:33:18.059634 88 libfastertransformer.cc:620] Get output name: output_log_probs, type: TYPE_FP32, shape: [-1, -1]
fauxpilot-triton-1         | terminate called after throwing an instance of 'std::runtime_error'
fauxpilot-triton-1         |   what():  [FT][ERROR] shared_ft_model->getTensorParaSize() * shared_ft_model->getPipelineParaSize() == world_size Assertion fail: /workspace/build/fastertransformer_backend/src/libfastertransformer.cc:498
fauxpilot-triton-1         |
fauxpilot-triton-1         | [7704449ec6f1:00088] *** Process received signal ***
fauxpilot-triton-1         | [7704449ec6f1:00088] Signal: Aborted (6)
fauxpilot-triton-1         | [7704449ec6f1:00088] Signal code:  (-6)
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 0] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7f3aa66b6420]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f3aa50aa00b]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f3aa5089859]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e911)[0x7f3aa5463911]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa38c)[0x7f3aa546f38c]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3f7)[0x7f3aa546f3f7]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa6a9)[0x7f3aa546f6a9]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 7] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x2a9a0)[0x7f3a930639a0]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 8] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x1e79f)[0x7f3a9305779f]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 9] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x1fd42)[0x7f3a93058d42]
fauxpilot-triton-1         | [7704449ec6f1:00088] [10] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(TRITONBACKEND_ModelInstanceInitialize+0x38c)[0x7f3a9305b63c]
fauxpilot-triton-1         | [7704449ec6f1:00088] [11] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x10c275)[0x7f3aa5958275]
fauxpilot-triton-1         | [7704449ec6f1:00088] [12] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x10d9c3)[0x7f3aa59599c3]
fauxpilot-triton-1         | [7704449ec6f1:00088] [13] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1019de)[0x7f3aa594d9de]
fauxpilot-triton-1         | [7704449ec6f1:00088] [14] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1b3b7a)[0x7f3aa59ffb7a]
fauxpilot-triton-1         | [7704449ec6f1:00088] [15] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1c29a1)[0x7f3aa5a0e9a1]
fauxpilot-triton-1         | [7704449ec6f1:00088] [16] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xd6de4)[0x7f3aa549bde4]
fauxpilot-triton-1         | [7704449ec6f1:00088] [17] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7f3aa66aa609]
fauxpilot-triton-1         | [7704449ec6f1:00088] [18] /usr/lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f3aa5186133]
fauxpilot-triton-1         | [7704449ec6f1:00088] *** End of error message ***
fauxpilot-triton-1         | --------------------------------------------------------------------------
fauxpilot-triton-1         | Primary job  terminated normally, but 1 process returned
fauxpilot-triton-1         | a non-zero exit code. Per user-direction, the job has been aborted.
fauxpilot-triton-1         | --------------------------------------------------------------------------
fauxpilot-triton-1         | --------------------------------------------------------------------------
fauxpilot-triton-1         | mpirun noticed that process rank 0 with PID 0 on node 7704449ec6f1 exited on signal 6 (Aborted).
fauxpilot-triton-1         | --------------------------------------------------------------------------
fauxpilot-triton-1 exited with code 134
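
The assertion that aborts the run in the log above is `getTensorParaSize() * getPipelineParaSize() == world_size`. The following is a minimal sketch of that invariant, not the backend's actual code. The function and parameter names are hypothetical and only mirror the getters named in the error message. What `world_size` counts is an assumption here: it is treated as the number of GPUs the backend actually sees.

```python
def parallel_layout_matches(tensor_para_size: int,
                            pipeline_para_size: int,
                            world_size: int) -> bool:
    """Return True when the model's parallel layout fits the world size.

    Mirrors the condition in the failed assertion:
    tensor_para_size * pipeline_para_size == world_size.
    All names are illustrative, not part of the FasterTransformer API.
    """
    return tensor_para_size * pipeline_para_size == world_size


# A model converted for 2-way tensor parallelism with no pipeline
# parallelism fits a world of 2 GPUs ...
two_gpus_ok = parallel_layout_matches(2, 1, 2)

# ... but not a world where only 1 usable GPU is visible
# (assumed scenario, e.g. when one of the devices cannot be used).
one_gpu_ok = parallel_layout_matches(2, 1, 1)

print(two_gpus_ok, one_gpu_ok)
```

Under these assumptions, a model prepared for 2 GPUs would trip the assertion whenever fewer than 2 usable GPUs are counted.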

Edit: my first GPU is Intel Xeon integrated graphics, which might not be a usable GPU for fauxpilot since the error
