
Comments (18)

jywu-msft commented on May 27, 2024

there are various strategies for reducing the session initialization time. we're in the process of putting together a doc to provide guidance.
+@chilo-ms

from onnxruntime.

hy846130226 commented on May 27, 2024

Hi @jywu-msft
Thanks! It would be very helpful to have such a document.

hy846130226 commented on May 27, 2024

I have read the source code and found that this operation costs a lot of time.
Could someone tell me why? Is ONNX Runtime doing some optimization on the model?

image

hy846130226 commented on May 27, 2024

Oh, I found the main place where the time is spent.
It's here:
image
It seems ONNX Runtime is loading the TensorRT EP.

How did they do it?
By reflection over the DLL, or something else?
Why does it cost so much time?

jywu-msft commented on May 27, 2024

There are two areas which cost the most time during TensorRT EP initialization.

  1. TensorRT builder instantiation. Here it loads a DLL with the TensorRT kernels.
  2. TensorRT engine build. (This can take the most time because it does kernel auto-tuning, where it measures timings for different kernels/tactics.)

For 2), there is an option to serialize a built engine to disk so that you don't need to rebuild it the next time you initialize a session. The option is trt_engine_cache_enable; can you try it?
Avoiding 1) is a little more complicated. If 2) is enough, then you can try that first.
@chilo-ms to add more comments.
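
As a rough illustration of 2), a session with the engine cache enabled might be configured like this (a minimal sketch using the Python API; the cache path and model path are placeholders):

```python
# TensorRT EP provider options for engine caching. trt_engine_cache_enable
# and trt_engine_cache_path are the option names discussed here; the paths
# are placeholders.
trt_options = {
    "device_id": 0,
    "trt_engine_cache_enable": True,         # serialize built engines to disk
    "trt_engine_cache_path": "./trt_cache",  # reused on the next session init
}

def make_session(model_path: str):
    # Deferred import so the options dict above can be inspected standalone.
    import onnxruntime as ort
    return ort.InferenceSession(
        model_path,
        providers=[("TensorrtExecutionProvider", trt_options)],
    )
```

On the first run the engine is built and written to trt_engine_cache_path; subsequent sessions deserialize it, which removes the engine-build cost (2) but not the builder-instantiation cost (1).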

hy846130226 commented on May 27, 2024

Hi @jywu-msft

I see. Regarding 2):
I found that even when I use trt_engine_cache_enable, it still costs time, though indeed less.
That is because it still creates the TRT IBuilder, which takes some time. To my knowledge, if I already have an off-the-shelf TRT engine, I only need IRuntime.

Like this:
image

So why doesn't ORT TRT check whether trt_engine_cache_enable is enabled and, if so, skip loading the IBuilder?
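
For comparison, in standalone TensorRT (outside ORT), loading a prebuilt engine only needs IRuntime, roughly like this (a sketch using the TensorRT Python API; the engine path is a placeholder):

```python
def load_engine(engine_path: str):
    # Deserializing a prebuilt engine needs only IRuntime, not IBuilder.
    import tensorrt as trt  # deferred; requires a local TensorRT install
    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)
    with open(engine_path, "rb") as f:
        return runtime.deserialize_cuda_engine(f.read())
```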

hy846130226 commented on May 27, 2024

And about 1):

I think it is indeed not easy.
Can you roughly describe the process for me? I'm having a bit of trouble understanding the code, so it would be greatly appreciated if you could!

chilo-ms commented on May 27, 2024

So why doesn't ORT TRT check whether trt_engine_cache_enable is enabled and, if so, skip loading the IBuilder?

ORT TRT has a similar feature (starting from 1.17.0) which skips TRT builder instantiation and simply deserializes the engine cache to run inference.

However, we still need an "ONNX" model to start with. So ORT TRT helps the user create an "embed engine" model, which is basically an ONNX model containing only one node that wraps the engine cache.
Running this embed engine model skips the lengthy processes such as TRT builder instantiation.

Please see the highlighted part below for how to use the ORT TRT provider options to generate/run an embed engine model.
image

BTW, we are working on documenting the usage of the embed engine model.
Also note that there are constraints on using it, such as:

  • The whole model should be TRT eligible.
  • It supports dynamic shape inputs only when the user explicitly specifies the shape range, meaning the engine won't be rebuilt across inference runs.
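
To make the two phases concrete, generating and then running an embed engine model might look roughly like this (a sketch: trt_dump_ep_context_model is the option named in this thread, while trt_ep_context_file_path and the output path are assumptions to be checked against the 1.17+ docs):

```python
# Phase 1: run the original model once with dumping enabled so ORT TRT
# writes the "embed engine" (EPContext) model alongside the engine cache.
dump_options = {
    "trt_engine_cache_enable": True,
    "trt_dump_ep_context_model": True,               # option name from this thread
    "trt_ep_context_file_path": "./model_ctx.onnx",  # assumed option/path
}

# Phase 2: later sessions load the embed engine model directly; the wrapped
# engine is deserialized, skipping builder instantiation and engine build.
def run_embed_engine(input_feed):
    import onnxruntime as ort  # deferred so the dict above is inspectable standalone
    sess = ort.InferenceSession(
        "./model_ctx.onnx",
        providers=[("TensorrtExecutionProvider",
                    {"trt_engine_cache_enable": True})],
    )
    return sess.run(None, input_feed)
```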

hy846130226 commented on May 27, 2024

Hi @chilo-ms,

I tried to use trt_dump_ep_context_model as follows:

image

But I got an error:
[ONNXRuntimeError] : 1 : FAIL : provider_options_utils.h:148 onnxruntime::ProviderOptionsParser::Parse Unknown provider option: "trt_dump_ep_context_model".

hy846130226 commented on May 27, 2024

And I tried to modify the source code in a simple way: I commented out the fields related to IBuilder, INetworkDefinition, and IParser.

image image

I found it could still work.

This is a simplistic version, I know.

I will continue to debug whether this approach causes any errors. Also, I want to know: if I have a TensorRT engine in trt_engine_cache_path and trt_engine_cache_enable is enabled, and I do not initialize IBuilder, is this approach correct?

I think if I comment out those fields related to IBuilder, INetworkDefinition, and IParser, so that outside code cannot get the associated objects, that would also prove that the outside code does not use those objects, right?

chilo-ms commented on May 27, 2024

Hi @chilo-ms,

I tried to use trt_dump_ep_context_model as follows:

image

But I got an error: [ONNXRuntimeError] : 1 : FAIL : provider_options_utils.h:148 onnxruntime::ProviderOptionsParser::Parse Unknown provider option: "trt_dump_ep_context_model".

Which ORT version are you using?
Please use 1.17.0 or above, or the main branch.
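
Since trt_dump_ep_context_model is only parsed by 1.17.0 and later, a quick guard against older installs could look like this (a hypothetical helper, not part of the ORT API):

```python
def supports_ep_context(version: str) -> bool:
    # trt_dump_ep_context_model is recognized starting with ORT 1.17.0.
    major, minor = (int(p) for p in version.split(".")[:2])
    return (major, minor) >= (1, 17)

# e.g. supports_ep_context(onnxruntime.__version__)
```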

chilo-ms commented on May 27, 2024

And I tried to modify the source code in a simple way: I commented out the fields related to IBuilder, INetworkDefinition, and IParser.

image image

I found it could still work.

This is a simplistic version, I know.

I will continue to debug whether this approach causes any errors. Also, I want to know: if I have a TensorRT engine in trt_engine_cache_path and trt_engine_cache_enable is enabled, and I do not initialize IBuilder, is this approach correct?

Your idea is basically right.
Please see the ORT TRT code (here and here) in the main branch.

In addition to the code path (in the EP's Compile) that you found involves builder instantiation, there is also builder instantiation in the EP's GetCapability. That's why we need the "embed engine" model to skip builder instantiation.

hy846130226 commented on May 27, 2024

Hi @chilo-ms

Thank you very much for your reply!

I will try to remove the process of generating the IBuilder when the model has already been generated.

And about the EP GetCapability, I also have a question; here is the link:
#20029

"So that's why we need the "embed engine" model to skip builder instantiation." I do not understand why the EP GetCapability method needs to generate an IBuilder object; to my knowledge, the IBuilder is used to generate other TRT objects, such as the INetworkDefinition.

And if I already have a TRT model converted from ONNX, could I skip this step in the process?

hy846130226 commented on May 27, 2024

Hi @chilo-ms,
I tried to use trt_dump_ep_context_model as follows:
image
But I got an error: [ONNXRuntimeError] : 1 : FAIL : provider_options_utils.h:148 onnxruntime::ProviderOptionsParser::Parse Unknown provider option: "trt_dump_ep_context_model".

Which ORT version are you using? Please use 1.17.0 or above, or the main branch.

Yes, my version is 1.16.3.

Because at first I downloaded your 1.17.0 or 1.17.3 packages, and there were no DLLs in them.
So I used 1.16.3.

Why don't the newest packages on NuGet have DLLs?

Also, I will build the DLL from the newest code.

jywu-msft commented on May 27, 2024

Use the 1.17.1 NuGet package.
There are multiple packages;
i.e., Microsoft.ML.OnnxRuntime.Gpu depends on Microsoft.ML.OnnxRuntime.Gpu.Windows,
and that package contains the onnxruntime DLLs.

hy846130226 commented on May 27, 2024

Hi @jywu-msft

I tried to use the 1.17.1 Microsoft.ML.OnnxRuntime.Gpu package, which depends on Microsoft.ML.OnnxRuntime.Gpu.Windows.

But I got this error:
image

I checked the structure of the 1.17.1 package and found that the directory was "buildTransitive", not "build", which causes Visual Studio to fail to load the .props/.targets files.

image

I feel confused; am I missing something?

chilo-ms commented on May 27, 2024

"So that's why we need the "embed engine" model to skip builder instantiation." I do not understand why the EP GetCapability method needs to generate an IBuilder object; to my knowledge, the IBuilder is used to generate other TRT objects, such as the INetworkDefinition.

And if I already have a TRT model converted from ONNX, could I skip this step in the process?

Because the TRT parser needs a TRT network, which depends on the TRT builder.
https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc#L2082
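
That dependency chain (builder → network → parser) is visible in the TensorRT Python API as well; roughly (a sketch of plain TensorRT usage, not ORT's internal code):

```python
def parse_onnx(onnx_bytes: bytes):
    import tensorrt as trt  # deferred; requires a local TensorRT install
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)      # the builder must be instantiated first...
    network = builder.create_network(  # ...to create the network definition...
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)  # ...which the parser populates
    parser.parse(onnx_bytes)
    return builder, network, parser
```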

If you have a TRT engine cache, you still need the embed engine model to skip the process for now.
Please see the embed engine model (EPContext node model) to skip the whole GetCapability.
Here are two PRs which introduce the embed engine model feature:
#18217
#19154
But we are working on another PR that can skip GetCapability without the embed engine model, simply with the engine cache. (This is the exact feature that you want.)

Also, I'm working on the documentation to help users better understand this feature.
