
Comments (5)

szalpal commented on May 30, 2024

@uefall,

Thank you for pointing that out! Could you tell me which version of dali_backend you are running? #16 fixed issues with choosing the device; that PR will be available in tritonserver-21.02. If you'd like to use the main branch of dali_backend, please follow the docker build instructions.

If you are using the main branch and still observe CPU allocation of the DALI output, could you run the server with --log-verbose=2 and check the instance groups in the full DALI model configuration that gets logged? Here's how it should look:

I0208 23:00:49.566997 1 dali_backend.cc:71] Loading DALI pipeline from file /models/dali/1/model.dali
I0208 23:00:49.567063 1 dali_backend.cc:44] model configuration:
{
    "name": "dali",
    "platform": "",
    "backend": "dali",
[...]
    "instance_group": [
        {
            "name": "dali",
            "kind": "KIND_GPU",
            "count": 1,
            "gpus": [
                0
            ],
            "profile": []
        }
    ],
[...]
}

Anyway, your proposal to change the default requested memory type is a good one. It's available in #21.
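
For anyone who wants to automate the instance group check described above, here is a minimal Python sketch. It assumes you have copied the complete JSON configuration out of the verbose log into a string; the helper name `instance_group_kinds` and the trimmed-down example configuration are illustrative only and are not part of dali_backend or Triton.

```python
import json


def instance_group_kinds(config_json):
    """Return (name, kind, gpus) for each instance group in a model configuration.

    `config_json` is expected to be the full JSON object printed by the server
    in verbose mode (hypothetical helper, not a dali_backend API).
    """
    config = json.loads(config_json)
    return [
        (group.get("name", ""), group.get("kind", ""), group.get("gpus", []))
        for group in config.get("instance_group", [])
    ]


# Trimmed-down configuration containing only the fields this check reads.
example = """
{
    "name": "dali",
    "backend": "dali",
    "instance_group": [
        {"name": "dali", "kind": "KIND_GPU", "count": 1, "gpus": [0], "profile": []}
    ]
}
"""

for name, kind, gpus in instance_group_kinds(example):
    status = "OK" if kind == "KIND_GPU" else "not a GPU instance"
    print(f"{name}: {kind} on GPUs {gpus} -> {status}")
```

Any instance group reported as something other than KIND_GPU would be worth a closer look before investigating the output memory type.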

from dali_backend.

uefall commented on May 30, 2024

Sorry for my late reply.
I am using this version:

commit 076e98841c976e0f6c55fc360431cfb5bfd7f485 (HEAD -> main, origin/r21.02)
Author: Michał <[email protected]>
Date:   Tue Jan 26 02:43:57 2021 +0100

Running with --log-verbose=2 shows:

I0220 02:03:17.687831 1 dali_backend.cc:71] Loading DALI pipeline from file /models/dali_ctdet/1/model.dali
I0220 02:03:17.687940 1 dali_backend.cc:44] model configuration:
{
    "name": "dali_ctdet",
    "platform": "",
    "backend": "dali",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 128,
    "input": [
        {
            "name": "DALI_INPUT_0",
            "data_type": "TYPE_UINT8",
            "format": "FORMAT_NONE",
            "dims": [
                -1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        }
    ],
    "output": [
        {
            "name": "DALI_OUTPUT_0",
            "data_type": "TYPE_FP32",
            "dims": [
                3,
                512,
                320
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "DALI_OUTPUT_1",
            "data_type": "TYPE_INT64",
            "dims": [
                3
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        }
    },
    "instance_group": [
        {
            "name": "dali_ctdet_0",
            "kind": "KIND_GPU",
            "count": 1,
            "gpus": [
                0
            ],
            "profile": []
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {},
    "model_warmup": []
}
I0220 02:03:17.688125 1 dali_backend.cc:348] TRITONBACKEND_ModelInstanceInitialize: dali_ctdet_0 (GPU device 0)

and the DALI outputs are still allocated with memory type 1 and memory type 2:

I0220 02:08:46.983816 1 infer_response.cc:165] add response output: output: DALI_OUTPUT_0, type: FP32, shape: [64,3,512,320]
I0220 02:08:46.983851 1 pinned_memory_manager.cc:131] pinned memory allocation: size 125829120, addr 0x7feb2e000090
I0220 02:08:46.983855 1 ensemble_scheduler.cc:509] Internal response allocation: DALI_OUTPUT_0, size 125829120, addr 0x7feb2e000090, memory type 1, type id 0
I0220 02:08:46.983860 1 infer_response.cc:165] add response output: output: DALI_OUTPUT_1, type: INT64, shape: [64,3]
I0220 02:08:46.983864 1 ensemble_scheduler.cc:509] Internal response allocation: DALI_OUTPUT_1, size 1536, addr 0x7feb2a000000, memory type 2, type id 0
I0220 02:08:46.983931 1 ensemble_scheduler.cc:524] Internal response release: size 125829120, addr 0x7feb2e000090
I0220 02:08:46.983936 1 ensemble_scheduler.cc:524] Internal response release: size 1536, addr 0x7feb2a000000

I will update to the latest version and try again. Thank you.
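
To make the "memory type" numbers in the log above easier to read, here is a small Python sketch that extracts the internal response allocations and names each memory type. The mapping follows Triton's `TRITONSERVER_MemoryType` enum (0 is CPU, 1 is pinned CPU, 2 is GPU); the regular expression and the helper name `parse_allocations` are assumptions written against the log lines quoted in this thread, not an official log format.

```python
import re

# Numeric values of Triton's TRITONSERVER_MemoryType enum.
MEMORY_TYPES = {0: "CPU", 1: "CPU_PINNED", 2: "GPU"}

# Pattern written against the ensemble_scheduler lines quoted in this thread.
ALLOCATION = re.compile(
    r"Internal response allocation: (?P<name>\w+), size (?P<size>\d+), "
    r"addr (?P<addr>0x[0-9a-f]+), memory type (?P<mem>\d+)"
)


def parse_allocations(log_text):
    """Return (output name, size in bytes, memory type name) per allocation."""
    results = []
    for match in ALLOCATION.finditer(log_text):
        mem = int(match.group("mem"))
        results.append(
            (
                match.group("name"),
                int(match.group("size")),
                MEMORY_TYPES.get(mem, f"UNKNOWN({mem})"),
            )
        )
    return results


log = """\
I0220 02:08:46.983855 1 ensemble_scheduler.cc:509] Internal response allocation: DALI_OUTPUT_0, size 125829120, addr 0x7feb2e000090, memory type 1, type id 0
I0220 02:08:46.983864 1 ensemble_scheduler.cc:509] Internal response allocation: DALI_OUTPUT_1, size 1536, addr 0x7feb2a000000, memory type 2, type id 0
"""

for name, size, memory in parse_allocations(log):
    print(f"{name}: {size} bytes in {memory}")
```

Read this way, the log says that DALI_OUTPUT_0 landed in pinned system memory while DALI_OUTPUT_1 was placed on the GPU.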


uefall commented on May 30, 2024

I updated the code and tested again, and got the same result.
DALI_OUTPUT_0 is the decoded image data and DALI_OUTPUT_1 is the image shape.
Why is DALI_OUTPUT_1 placed on the GPU while DALI_OUTPUT_0 remains on the CPU?

I removed the image shape output and left only the image data output; it is still memory type 1:

I0220 06:30:07.825872 1 pinned_memory_manager.cc:131] pinned memory allocation: size 125829120, addr 0x7f664e000090
I0220 06:30:07.825877 1 ensemble_scheduler.cc:509] Internal response allocation: DALI_OUTPUT_0, size 125829120, addr 0x7f664e000090, memory type 1, type id 0
I0220 06:30:07.825922 1 ensemble_scheduler.cc:524] Internal response release: size 125829120, addr 0x7f664e000090


uefall commented on May 30, 2024

Tested OK with the latest version; maybe I forgot to change the container last time.
I will close this issue.
@szalpal, thank you!


uefall commented on May 30, 2024

The problem is that I used a large batch size for the perf test, and the DALI_OUTPUT exceeded the CUDA memory limit:

W0222 10:25:26.739351 1 memory.cc:135] Failed to allocate CUDA memory with byte size 251658240 on GPU 0: CNMEM_STATUS_OUT_OF_MEMORY, falling back to pinned system memory

This message is shown only once, and I missed it.
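
The byte sizes in these logs can be reproduced from the output shape and data type in the model configuration, which makes it easy to see when a batch will not fit. The sketch below is a rough estimate only: the pool size is an assumed value (64 MiB, which I believe is the default of tritonserver's --cuda-memory-pool-byte-size option), so substitute whatever your server is actually configured with. The names `output_bytes` and `ASSUMED_POOL_BYTES` are illustrative.

```python
from math import prod

# Bytes per element for the data types that appear in this thread.
DTYPE_BYTES = {"TYPE_UINT8": 1, "TYPE_FP32": 4, "TYPE_INT64": 8}

# ASSUMPTION: size of the CUDA memory pool; replace with your server's setting.
ASSUMED_POOL_BYTES = 64 * 1024 * 1024


def output_bytes(batch_size, dims, data_type):
    """Size in bytes of one batched output tensor."""
    return batch_size * prod(dims) * DTYPE_BYTES[data_type]


# DALI_OUTPUT_0 from the configuration above: FP32 with dims [3, 512, 320].
for batch in (64, 128):
    size = output_bytes(batch, [3, 512, 320], "TYPE_FP32")
    fits = "fits in" if size <= ASSUMED_POOL_BYTES else "exceeds"
    print(f"batch {batch}: {size} bytes, {fits} the assumed pool")
```

A batch of 64 gives 125829120 bytes and a batch of 128 gives 251658240 bytes, matching the sizes in the allocation log and in the out-of-memory warning. The image shape output (INT64, dims [3]) needs only 1536 bytes for a batch of 64, which would explain why it could be placed on the GPU while the image data fell back to pinned memory.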

