This repo compares different deployment methods for an ML model from the Hugging Face Hub. We use bart-large-mnli for our experiments, but we expect the results to hold for most ML models (at least Transformers-based ones).
Each compared method lives in its own subfolder of the `methods` folder.
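For context, bart-large-mnli does zero-shot classification by recasting each candidate label as a natural-language-inference hypothesis and scoring it against the input text as the premise. A minimal sketch of that templating step (the `build_nli_pairs` helper is illustrative, not part of this repo; the template string is the Transformers default):

```python
def build_nli_pairs(text: str, labels: list[str]) -> list[tuple[str, str]]:
    # The default Hugging Face hypothesis template is "This example is {}.";
    # the model then scores entailment of each hypothesis given the text.
    return [(text, f"This example is {label}.") for label in labels]

pairs = build_nli_pairs("I love this movie", ["positive", "negative"])
# Each (premise, hypothesis) pair is fed to the NLI model; the label whose
# hypothesis gets the highest entailment score wins.
```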
For each method, follow these steps.

Go into the method's folder:

```shell
export METHOD_NAME=<name-of-the-method>
cd methods/$METHOD_NAME
```
Create a Conda environment:

```shell
conda create --name ml-deploy-$METHOD_NAME python=3.10
conda activate ml-deploy-$METHOD_NAME
```
Install the dependencies:

```shell
pip install -r requirements.txt
```
Download the model:

```shell
chmod +x setup.sh
./setup.sh
```
Still from the method's folder, you can start the server:

```shell
chmod +x run.sh
./run.sh
```

This starts the server, and you can then test the API.
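As a smoke test you can send a request from another terminal. The route, port, and payload below are assumptions for illustration; the actual schema depends on the method's server code, so check its interactive docs:

```shell
# Hypothetical request -- adjust the port, route, and payload to match the
# schema shown in the method's API docs.
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "I love this movie", "candidate_labels": ["positive", "negative"]}'
```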
Go back to the root folder of the project:

```shell
cd ../..
```
From there, build the Docker image:

```shell
docker build -t $METHOD_NAME -f methods/$METHOD_NAME/Dockerfile .
```
To run it, simply use:

```shell
export PORT_TO_USE=8000
docker run -p $PORT_TO_USE:80 --name $METHOD_NAME $METHOD_NAME
```
Access it at http://localhost:$PORT_TO_USE/docs.
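If you prefer to query the containerized API from Python, here is a minimal client sketch. The `/predict` route and payload shape are assumptions, the real schema is what the interactive docs at `/docs` list:

```python
import json
import urllib.request

def build_request(port: int, text: str, labels: list[str]) -> urllib.request.Request:
    # Assumed payload shape; adapt it to the schema shown at /docs.
    payload = {"text": text, "candidate_labels": labels}
    return urllib.request.Request(
        f"http://localhost:{port}/predict",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(8000, "I love this movie", ["positive", "negative"])
# To actually send it (requires the running container):
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```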