ml-serving-stress-test's Introduction

Deploy a model using TF Serving and stress test it.

This tutorial is based on this Medium post, https://towardsdatascience.com/use-pre-trained-huggingface-models-in-tensorflow-serving-d2761f7e69f6, which shows how to use Hugging Face models with TF Serving.

I took a few additional steps beyond the ones shown there: I created a Go client to call the model for inference, and I stress tested the model on different machines.

This tutorial has the following sections:

  1. Installation instructions
  2. How to get a TF SavedModel
  3. How to serve your model using TF Serving on Docker
  4. How to stress test the served model

Installation instructions

You may need to install TensorFlow on your local machine to run the tests.

Mac and Linux with amd64 (x86-64) architecture

If you are using a Linux amd64 machine, you can use conda to install tensorflow and transformers without any problems.

Mac M1 (ARM)

For a Mac M1, you have to use miniforge or miniconda. It took me a few minutes to figure out how to install it; the following resources helped:

  • https://developer.apple.com/metal/tensorflow-plugin/
  • https://developer.apple.com/forums/thread/702851
  • https://jamescalam.medium.com/hugging-face-and-sentence-transformers-on-m1-macs-4b12e40c21ce

What worked for me was installing miniforge by following the first link and then running the commands below. Make sure that tensorflow-deps and tensorflow-macos have the same version:

conda install -c apple tensorflow-deps
python -m pip install tensorflow-macos==2.9
python -m pip install tensorflow-metal==0.5 

Obtain a TensorFlow SavedModel

To deploy on TensorFlow Serving you first need a SavedModel, which is a complete TensorFlow program, including variables and computation, so you can deploy it with TFLite, TensorFlow.js, or TF Serving.

Model selection, Sentiment analysis - HuggingFace

I browsed the Hugging Face models for something related to sentiment analysis and picked a simple BERT model fine-tuned for that task: https://huggingface.co/textattack/bert-base-uncased-SST-2. It only provides PyTorch checkpoints. I chose a plain BERT model because I will be running inference from different clients, including Go and Python backends, and I was worried about not being able to use the Hugging Face tokenizer from Go.

Saving a hugging face pytorch model as tf model

As there are no TensorFlow checkpoints for this model, you can save the PyTorch model in TensorFlow format by running the convert script, which loads the transformer model from the PyTorch weights and then saves it as if it were a TensorFlow model.

python hugging_face/convert_pytorch_to_tf.py 
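The contents of the convert script are not shown here, so the following is only a minimal sketch of what such a conversion could look like, assuming a transformers release that still ships the TensorFlow model classes and that both tensorflow and torch are installed. The function names and the output directory are hypothetical; only the model ID comes from the text above.

```python
import os

# Model ID taken from the model page linked above.
MODEL_ID = "textattack/bert-base-uncased-SST-2"


def saved_model_dir(output_dir, version=1):
    """Return the folder where save_pretrained(saved_model=True) writes the SavedModel."""
    return os.path.join(output_dir, "saved_model", str(version))


def convert(model_id=MODEL_ID, output_dir="bert-base-uncased-SST-2"):
    """Load PyTorch weights into a TF model and export it as a SavedModel.

    output_dir is a hypothetical name chosen for this sketch.
    """
    # Imported lazily because transformers, tensorflow and torch are heavy.
    from transformers import TFAutoModelForSequenceClassification

    # from_pt=True tells transformers to build the TF model from PyTorch weights.
    model = TFAutoModelForSequenceClassification.from_pretrained(model_id, from_pt=True)

    # saved_model=True additionally exports a SavedModel next to the TF weights.
    model.save_pretrained(output_dir, saved_model=True)
    return saved_model_dir(output_dir)
```

Calling convert() downloads the weights and returns the versioned SavedModel folder. TF Serving expects a numeric version directory directly under the model directory, so it is probably the parent saved_model folder that needs to end up as the model directory inside the container.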

Deploy your model

The recommended way to deploy your model is to run TF Serving in Docker. My preferred approach is to create my own serving image:

Remember that, for now, it is better to stay on amd64 rather than ARM, so move to a Linux amd64 instance if necessary.

Follow the steps here: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/docker.md#creating-your-own-serving-image

Start a base container, copy the model into it, and commit the result as a new image:

  • docker run -d --name serving_base tensorflow/serving
  • docker cp /bert-base-uncased-SST-2 serving_base:/models/bert-base-uncased-SST-2
  • docker commit --change "ENV MODEL_NAME bert-base-uncased-SST-2" serving_base bert-base-uncased-sst2-image

Run a container with your image

  • docker run -p 8501:8501 -t bert-base-uncased-sst2-image
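Once the container is up, the model can be queried over TF Serving's REST API on port 8501. The sketch below uses only the Python standard library and assumes the exported signature accepts input_ids, attention_mask and token_type_ids, which is typical for Hugging Face BERT exports but is not confirmed for this repository. The helper names are hypothetical, and the token IDs in the usage note are illustrative placeholders that would normally come from the model's tokenizer.

```python
import json
import urllib.request


def build_predict_payload(input_ids, attention_mask=None, token_type_ids=None):
    """Build a row-format ("instances") request body for one tokenized sentence."""
    if attention_mask is None:
        attention_mask = [1] * len(input_ids)  # attend to every token
    if token_type_ids is None:
        token_type_ids = [0] * len(input_ids)  # single-segment input
    instance = {
        "input_ids": list(input_ids),
        "attention_mask": list(attention_mask),
        "token_type_ids": list(token_type_ids),
    }
    return {"instances": [instance]}


def predict(host, model_name, payload, timeout=10.0):
    """POST the payload to the model's :predict endpoint and return the parsed JSON."""
    url = f"{host}/v1/models/{model_name}:predict"
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, predict("http://localhost:8501", "bert-base-uncased-SST-2", build_predict_payload([101, 2023, 3185, 2003, 2307, 102])) should return a JSON object with a "predictions" field, assuming the model name matches the MODEL_NAME set in the image.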

Stress test

After installing locust, you can run it with the following command:

locust --host=http://ec2-3-132-201-36.us-east-2.compute.amazonaws.com:8501 -f inference_clients/locust_client.py
