whisper-api-flask's Introduction

What is Whisper?

Whisper is a state-of-the-art automatic speech recognition system from OpenAI, trained on 680,000 hours of multilingual and multitask supervised data collected from the web. This large and diverse dataset improves robustness to accents, background noise and technical language. It also enables transcription in multiple languages, as well as translation from those languages into English. OpenAI released the models and code to serve as a foundation for building useful applications that leverage speech recognition.

How to start with Docker

  1. First of all, if you plan to run the container on your local machine, you need to have Docker installed. You can find the installation instructions here.
  2. Create a folder for our files; let's call it whisper-api.
  3. Create a file called requirements.txt and add flask to it.
  4. Create a file called Dockerfile.

In the Dockerfile we will add the following lines:

FROM python:3.10-slim

WORKDIR /python-docker

COPY requirements.txt requirements.txt
RUN apt-get update && apt-get install git -y
RUN pip3 install -r requirements.txt
RUN pip3 install "git+https://github.com/openai/whisper.git" 
RUN apt-get install -y ffmpeg

COPY . .

EXPOSE 5000

CMD ["python3", "-m", "flask", "run", "--host=0.0.0.0"]

So what is happening exactly in the Dockerfile?

  1. Choosing a Python 3.10 slim image as our base image.
  2. Creating a working directory called python-docker.
  3. Copying our requirements.txt file to the working directory.
  4. Updating the apt package index and installing git.
  5. Installing the requirements from the requirements.txt file.
  6. Installing the whisper package from GitHub.
  7. Installing ffmpeg.
  8. Copying the rest of our files into the image, exposing port 5000 and running the Flask server.

How to create our route

  1. Create a file called app.py where we import all the necessary packages and initialize the flask app and whisper.
  2. Add the following lines to the file:
from flask import Flask, abort, request
from tempfile import NamedTemporaryFile
import whisper
import torch

# Use the NVIDIA GPU if one is available, otherwise fall back to the CPU.
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Load the Whisper model:
model = whisper.load_model("base", device=DEVICE)

app = Flask(__name__)
  3. Now we need to create a route that accepts a POST request with a file in it.
  4. Add the following lines to the app.py file:
@app.route("/")
def hello():
    return "Whisper Hello World!"


@app.route('/whisper', methods=['POST'])
def handler():
    if not request.files:
        # If the user didn't submit any files, return a 400 (Bad Request) error.
        abort(400)

    # For each file, let's store the results in a list of dictionaries.
    results = []

    # Loop over every file that the user submitted.
    # Note: the keys of `request.files` are the form field names (e.g. "file").
    for filename, handle in request.files.items():
        # Create a temporary file; its path is available in `temp.name`.
        # The file is deleted automatically when the `with` block exits.
        with NamedTemporaryFile() as temp:
            # Write the user's uploaded file to the temporary file and flush,
            # so the data is on disk before Whisper opens the path.
            handle.save(temp)
            temp.flush()
            # Let's get the transcript of the temporary file.
            result = model.transcribe(temp.name)
        # Now we can store the result object for this file.
        results.append({
            'filename': filename,
            'transcript': result['text'],
        })

    # This will be automatically converted to JSON.
    return {'results': results}
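
The handler hands a temporary file's path to a consumer that opens the file by name. The pattern can be tried in isolation with the standard library only. This is a minimal sketch, not part of the original app: `size_on_disk_via_tempfile` is a hypothetical helper, and reading the file back through its path stands in for the call to `model.transcribe(temp.name)`. It assumes a Linux environment such as the container, where a named temporary file can be reopened while it is still open.

```python
from pathlib import Path
from tempfile import NamedTemporaryFile


def size_on_disk_via_tempfile(payload: bytes) -> int:
    """Write payload to a named temp file and read it back through its path.

    Hypothetical helper: reading via the path stands in for a path-based
    consumer such as `model.transcribe(temp.name)`.
    """
    with NamedTemporaryFile() as temp:
        temp.write(payload)
        # Push buffered bytes to disk so a reader opening temp.name sees them.
        temp.flush()
        # Leaving the `with` block (here via return) closes and deletes the file.
        return len(Path(temp.name).read_bytes())


print(size_on_disk_via_tempfile(b"fake audio bytes"))
```

Flushing before the path is used matters because writes to the file object may still sit in a buffer when another reader opens the same path.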

How to run the container?

  1. Open a terminal and navigate to the folder where you created the files.
  2. Run the following command to build the container:
docker build -t whisper-api .
  3. Run the following command to run the container:
docker run -p 5000:5000 whisper-api

If you run into errors on macOS, add RUN pip3 install markupsafe==2.0.1 to the Dockerfile.

How to run the container with Podman:

cd /tmp
git clone https://github.com/lablab-ai/whisper-api-flask whisper
cd whisper
mv Dockerfile Containerfile
podman build --network="host" -t whisper .
podman run --network="host" -p 5000:5000 whisper

Then run:

curl -F "file=@/path/to/filename.mp3" http://localhost:5000/whisper

The response should be a JSON object containing the transcript.

How to test the API?

  1. You can test the API by sending a POST request to the route http://localhost:5000/whisper with a file attached. The body should be form-data.
  2. You can use the following curl command to test the API:
curl -F "file=@/path/to/file" http://localhost:5000/whisper
  3. As a result, you should get a JSON object with the transcript in it.
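
The same request can also be sent from Python. The following is a minimal sketch that uses only the standard library; `build_multipart` and `transcribe` are hypothetical helper names, not part of this repo. It assumes the server is reachable at http://localhost:5000/whisper and replies with the `{'results': [...]}` structure returned by the handler in app.py.

```python
import json
import uuid
from urllib import request


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, url: str = "http://localhost:5000/whisper") -> list[dict]:
    """POST the file at `path` and return the `results` list from the JSON reply."""
    with open(path, "rb") as fh:
        # "file" matches the form field name used in the curl command.
        body, content_type = build_multipart("file", path.rsplit("/", 1)[-1], fh.read())
    req = request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["results"]
```

With the container running, calling `transcribe("/path/to/file")` should return a list with one dictionary per uploaded file, each holding a `filename` and a `transcript` key.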

How to deploy the API?

This API can be deployed anywhere Docker can run. Keep in mind that this setup currently uses the CPU to process the audio files. If you want to use a GPU, you need to change the Dockerfile and share the GPU with the container. I won't go deeper into this, as this is an introduction. See: Docker GPU
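
To see which device the app would pick in a given environment, a short check can be run inside the container. This is a minimal sketch mirroring the device selection in app.py; `pick_device` is a hypothetical helper name, and the fallback to "cpu" when PyTorch cannot be imported is an assumption added so the snippet also runs outside the image.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # installed in the image as a Whisper dependency
    except ImportError:
        # Assumption for this sketch: without PyTorch, report "cpu".
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Whisper would run on: {pick_device()}")
```

If this prints "cpu" on a machine that has an NVIDIA GPU, the GPU is most likely not shared with the container yet.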

You can find the whole code here

Thank you for reading! If you enjoyed this tutorial you can find more and continue reading on our tutorial page


Artificial Intelligence Hackathons, tutorials and Boilerplates

Join the LabLab Discord

On the lablab Discord, we discuss this repo and many other topics related to artificial intelligence! Check out the upcoming Artificial Intelligence Hackathons events.

Accelerating innovation through acceleration

Contributors

elliotcarey0011, flafi87
