Triton Paddle Backend

Quick Start

Build Paddle

The Paddle backend depends on the Paddle Inference API, so the Paddle inference library must be available before you build the backend.

Use build_paddle.sh to build the Paddle inference library and headers. This step may take a long time.

$ cd paddle-lib
$ bash build_paddle.sh
$ cd .. # back to root of paddle_backend

After Paddle is built successfully, check that a directory named paddle exists under the paddle-lib directory.
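
The check described above can be scripted. The following is a minimal sketch; the function name check_paddle_lib is hypothetical, and only the paddle-lib/paddle path comes from this README.

```shell
# Hypothetical helper (not part of the repository): confirm that the Paddle
# inference library directory exists after running build_paddle.sh.
# Usage: check_paddle_lib [LIB_ROOT]   (LIB_ROOT defaults to paddle-lib)
check_paddle_lib() {
    lib_root="${1:-paddle-lib}"
    if [ -d "$lib_root/paddle" ]; then
        echo "found: $lib_root/paddle"
    else
        echo "missing: $lib_root/paddle (re-run build_paddle.sh)" >&2
        return 1
    fi
}
```

Run check_paddle_lib from the root of paddle_backend; a non-zero exit status means the directory was not found.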

Build Paddle backend

Build libtriton_paddle.so with scripts/build_paddle_backend.sh.

$ bash scripts/build_paddle_backend.sh

Create A Model Repository

The model repository is the directory where you place the models that you want Triton to serve. An example model repository is included in the examples. Before using the repository, fetch it with the following script.

$ cd examples
$ ./fetch_models.sh
$ cd .. # back to root of paddle_backend
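
After fetching, it can help to confirm what the repository contains. The sketch below assumes the standard Triton layout, where each model has its own directory holding a config.pbtxt and one or more numeric version subdirectories. The function name list_models is hypothetical, and the repository path is passed as an argument because this README does not name the fetched directory.

```shell
# Hypothetical helper: print every model under REPO that has a config.pbtxt
# and at least one numeric version subdirectory (standard Triton layout).
# Usage: list_models REPO
list_models() {
    repo="$1"
    for model_dir in "$repo"/*/; do
        # Skip directories without a model configuration file.
        [ -f "${model_dir}config.pbtxt" ] || continue
        for version_dir in "$model_dir"*/; do
            # Keep only subdirectories whose names are purely numeric.
            case "$(basename "$version_dir")" in
                ''|*[!0-9]*) continue ;;
            esac
            echo "$(basename "$model_dir") (version $(basename "$version_dir"))"
        done
    done
}
```

An empty result suggests that the fetch did not complete or that the path does not point at the model repository.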

Launch Triton Inference Server

Launch Triton Inference Server on a single GPU. If necessary, change the Docker-related configuration in scripts/launch_triton_server.sh.

$ bash scripts/launch_triton_server.sh

Verify Triton Is Running Correctly

Use Triton’s ready endpoint to verify that the server and the models are ready for inference. From the host system use curl to access the HTTP endpoint that indicates server status.

$ curl -v localhost:8000/v2/health/ready
...
< HTTP/1.1 200 OK
< Content-Length: 0
< Content-Type: text/plain

The HTTP request returns status 200 if Triton is ready and non-200 if it is not ready.
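
Because the server may need some time to load the models, a script can poll the ready endpoint instead of checking it once. This is a minimal sketch: the function name wait_for_triton and the retry and delay defaults are assumptions, while the URL is the one used above.

```shell
# Hypothetical helper: poll Triton's ready endpoint until it returns 200.
# Usage: wait_for_triton [URL] [RETRIES] [DELAY_SECONDS]
wait_for_triton() {
    url="${1:-localhost:8000/v2/health/ready}"
    retries="${2:-30}"   # assumed default, adjust as needed
    delay="${3:-2}"      # assumed default, in seconds
    attempt=1
    while [ "$attempt" -le "$retries" ]; do
        # -s: silent, -o /dev/null: discard the body,
        # -w '%{http_code}': print only the HTTP status code.
        status="$(curl -s -o /dev/null -w '%{http_code}' "$url" || true)"
        if [ "$status" = "200" ]; then
            echo "ready after $attempt attempt(s)"
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "not ready after $retries attempts" >&2
    return 1
}
```

Calling wait_for_triton before running the examples returns a non-zero status if the server never reports ready, which makes it usable as a guard in other scripts.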

Examples

Before running the examples, make sure the Triton server is running correctly.

Change the working directory to examples and download the data.

$ cd examples
$ ./fetch_perf_data.sh # download benchmark input

ERNIE Base

ERNIE-2.0 is a pre-training framework for language understanding.

Steps to run the benchmark on ERNIE

$ bash perf_ernie.sh

ResNet50 v1.5

ResNet50 v1.5 is a modified version of the original ResNet50 v1 model.

Steps to run the benchmark on ResNet50-v1.5

$ bash perf_resnet50_v1.5.sh

Steps to run inference on ResNet50-v1.5

  1. Prepare the processed images by following DeepLearningExamples, and place the imagenet folder under the examples directory.

  2. Run the inference

$ bash infer_resnet_v1.5.sh imagenet/<id>

Performance

ERNIE Base (T4)

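| Precision | Backend Accelerator | Client Batch Size | Sequences/second | P90 Latency (ms) | P95 Latency (ms) | P99 Latency (ms) | Avg Latency (ms) |
|-----------|---------------------|-------------------|------------------|------------------|------------------|------------------|------------------|
| FP16      | TensorRT            | 1                 | 270.0            | 3.813            | 3.846            | 4.007            | 3.692            |
| FP16      | TensorRT            | 2                 | 500.4            | 4.282            | 4.332            | 4.709            | 3.980            |
| FP16      | TensorRT            | 4                 | 831.2            | 5.141            | 5.242            | 5.569            | 4.797            |
| FP16      | TensorRT            | 8                 | 1128.0           | 7.788            | 7.949            | 8.255            | 7.089            |
| FP16      | TensorRT            | 16                | 1363.2           | 12.702           | 12.993           | 13.507           | 11.738           |
| FP16      | TensorRT            | 32                | 1529.6           | 22.495           | 22.817           | 24.634           | 20.901           |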

ResNet50 v1.5 (V100-SXM2-16G)

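| Precision | Backend Accelerator | Client Batch Size | Sequences/second | P90 Latency (ms) | P95 Latency (ms) | P99 Latency (ms) | Avg Latency (ms) |
|-----------|---------------------|-------------------|------------------|------------------|------------------|------------------|------------------|
| FP16      | TensorRT            | 1                 | 288.8            | 3.494            | 3.524            | 3.608            | 3.462            |
| FP16      | TensorRT            | 2                 | 494.0            | 4.083            | 4.110            | 4.208            | 4.047            |
| FP16      | TensorRT            | 4                 | 758.4            | 5.327            | 5.359            | 5.460            | 5.273            |
| FP16      | TensorRT            | 8                 | 1044.8           | 7.728            | 7.770            | 7.949            | 7.658            |
| FP16      | TensorRT            | 16                | 1267.2           | 12.742           | 12.810           | 13.883           | 12.647           |
| FP16      | TensorRT            | 32                | 1113.6           | 28.840           | 29.044           | 30.357           | 28.641           |
| FP16      | TensorRT            | 64                | 1100.8           | 58.512           | 58.642           | 59.967           | 58.251           |
| FP16      | TensorRT            | 128               | 1049.6           | 121.371          | 121.834          | 123.371          | 119.991          |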

ResNet50 v1.5 (T4)

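| Precision | Backend Accelerator | Client Batch Size | Sequences/second | P90 Latency (ms) | P95 Latency (ms) | P99 Latency (ms) | Avg Latency (ms) |
|-----------|---------------------|-------------------|------------------|------------------|------------------|------------------|------------------|
| FP16      | TensorRT            | 1                 | 291.8            | 3.471            | 3.489            | 3.531            | 3.427            |
| FP16      | TensorRT            | 2                 | 466.0            | 4.323            | 4.336            | 4.382            | 4.288            |
| FP16      | TensorRT            | 4                 | 665.6            | 6.031            | 6.071            | 6.142            | 6.011            |
| FP16      | TensorRT            | 8                 | 833.6            | 9.662            | 9.684            | 9.767            | 9.609            |
| FP16      | TensorRT            | 16                | 899.2            | 18.061           | 18.208           | 18.899           | 17.748           |
| FP16      | TensorRT            | 32                | 761.6            | 42.333           | 43.456           | 44.167           | 41.740           |
| FP16      | TensorRT            | 64                | 793.6            | 79.860           | 80.410           | 80.807           | 79.680           |
| FP16      | TensorRT            | 128               | 793.6            | 158.207          | 158.278          | 158.643          | 157.543          |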
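
The throughput and average latency columns can be cross-checked against each other. The sketch below assumes a single client sending requests back to back, which the tables do not state; under that assumption, throughput is roughly batch size × 1000 / average latency in milliseconds. The function name estimate_throughput is hypothetical.

```shell
# Hypothetical helper: estimate sequences/second from one table row.
# Assumption (not stated in the tables): one client issues requests back to
# back, so throughput is about batch_size * 1000 / average latency in ms.
# Usage: estimate_throughput BATCH_SIZE AVG_LATENCY_MS
estimate_throughput() {
    awk -v batch="$1" -v latency_ms="$2" \
        'BEGIN { printf "%.1f\n", batch * 1000 / latency_ms }'
}
```

For the ERNIE Base row with batch size 1, estimate_throughput 1 3.692 gives about 270.9, close to the reported 270.0. A large gap between the estimate and the reported value would suggest that the measurement used a different level of client concurrency.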
