AIrtifex

AIrtifex is a self-hosted generative AI server and web app. The API provides the endpoints necessary for interacting with the generative models, while the web app is a client-side rendered WASM application for user interaction. The entire project is written in Rust.

Preview GIF

Prerequisites

To work with this project, you will need the following tools installed:

  • a recent Rust toolchain (the entire project is written in Rust)
  • make, which drives the build and run targets used below
  • Trunk, for building and serving the WASM web app
  • Docker and docker-compose, if you want to use the containerized setup

Setup

  • Clone the repository:
git clone https://github.com/vv9k/airtifex.git
cd airtifex

Getting the weights

This repository doesn't contain any models or weights; you'll need to obtain them yourself before running the server. Currently supported model types are:

  • Bloom
  • GPT-2
  • GPT-J
  • LLaMa
  • NeoX

For image generation, Stable Diffusion models can be used; the pretrained weights need to be downloaded separately.

Once the models are downloaded, their locations can be specified in the configuration.

API Configuration

Below is an example configuration for the server that loads a single quantized 7B Alpaca model for text generation as well as Stable Diffusion v2.1 for image generation:

---
listen_addr: 127.0.0.1
listen_port: 6901
db_url: sqlite://data.db
#db_url: postgres://airtifex:airtifex@localhost/airtifex
jwt_secret: change-me!

llms:
  - model_path: ./llm_models/ggml-alpaca-7b-q4.bin
    model_description: Alpaca 7B, quantized
    float16: false
    type: LLaMa

stable_diffusion:
  - version: v2.1
    name: sd-v2.1
    model_description: Stable Diffusion v2.1
    clip_weights_path: ./sd_models/clip_v2.1.ot
    vae_weights_path: ./sd_models/vae_v2.1.ot
    unet_weights_path: ./sd_models/unet_v2.1.ot
    vocab_file: ./sd_models/bpe_simple_vocab_16e6.txt

Building and Running the Project

The default username and password for the API are both admin.

Running with Docker

The simplest way to run this project is with Docker and docker-compose. To do so, run:

make run_docker

This will build the image and run the API and the web app in containers behind an nginx reverse proxy. The docker-compose.yaml file contains an example of how to run the app. It mounts the data directory as a volume, which contains the database file as well as the text/image models (you'll have to put the models there or change the source location of the volume before running).

The app will be accessible at http://localhost:8091

The API can also be accessed through the same port, e.g. http://localhost:8091/api/v1/llm/inference.
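
As a quick sanity check that the proxied API works, you can send the login request from the Authentication section through the nginx port instead of hitting the API server directly (a minimal sketch, assuming the default admin/admin credentials):

❯ curl -H 'Content-Type: application/json' \
     -d '{"username":"admin","password":"admin"}' \
     http://localhost:8091/api/v1/users/login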

API with SQLite

To build and run the project using SQLite as the database, follow these steps:

  • To run directly, use:
# start the server
cd airtifex-api
make serve_release
  • To build, use:
cd airtifex-api
make build_release

The binary will be in the target/release directory after the build succeeds.
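
If you'd rather start the built binary directly instead of through make, something along these lines should work. Note that this is only a sketch: the binary name and the assumption that the server reads config.yaml from the working directory are not spelled out in this README:

❯ cd airtifex-api
❯ ./target/release/airtifex-api   # hypothetical binary name; assumes config.yaml is in the current directory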

API with PostgreSQL

To build and run the project using PostgreSQL as the database, follow these steps:

  • Set up a PostgreSQL database and update the db_url field in the API configuration file (e.g., airtifex-api/config.yaml); a sketch of creating a matching database follows these steps.

  • Run directly:

cd airtifex-api
make serve_release_pg
  • Build the API server with PostgreSQL support:
cd airtifex-api
make build_release_pg
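
For the first step, here is one way to create a database matching the commented-out db_url from the example configuration (postgres://airtifex:airtifex@localhost/airtifex). This is a sketch using standard PostgreSQL tooling, not a command taken from this repository:

❯ sudo -u postgres psql -c "CREATE ROLE airtifex WITH LOGIN PASSWORD 'airtifex';"
❯ sudo -u postgres psql -c "CREATE DATABASE airtifex OWNER airtifex;"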

Web App

In another terminal, start the web app:

cd airtifex-web
make serve_release

The web app will be accessible at http://localhost:8091 by default and is configured to connect to the API server at localhost:6901. To change these defaults, edit the values in the Trunk.toml file.
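
If you want to invoke Trunk directly rather than via the Makefile target, the equivalent is presumably something along the lines of the following sketch (the exact flags used by make serve_release are not shown here):

❯ cd airtifex-web
❯ trunk serve --release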

Systemd Service

An example systemd service for the API server can be found here.

Nginx reverse proxy

An example configuration for running behind an nginx reverse proxy can be found here.

Using the API

The exposed API can be used with any HTTP client. Below are some examples of important endpoints.

Authentication

To use the API, first authenticate with a username and password. The examples below use curl and jq to extract the authentication token and save it to a file. In this example we authenticate as admin:

❯ curl -H 'Content-Type: application/json' \
     -d '{"username":"admin","password":"admin"}' \
     http://localhost:6901/api/v1/users/login | jq -r .data.token > auth-token
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   330  100   291  100    39   435k  59724 --:--:-- --:--:-- --:--:--  322k

Inference

Request body fields:

{
    prompt: String,
    model: String,
    num_predict: Option<usize>,
    n_batch: Option<usize>,
    top_k: Option<usize>,
    top_p: Option<f32>,
    repeat_penalty: Option<f32>,
    temp: Option<f32>,
    play_back_tokens: Option<bool>,
    save: Option<bool>,
}

Below is an example asking for the capital of France. The response is streamed back with the Content-Type: text/event-stream and Transfer-Encoding: chunked headers.

❯ curl -X POST \
       -N \
       -H 'Content-Type: application/json' \
       -H "Authorization: Bearer $(cat auth-token)" \
       -d '{"prompt": "What is the capital of France?", "model": "ggml-alpaca-7b-q4"}' \
       http://localhost:6901/api/v1/llm/inference

The capital of France is Paris.
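
The optional fields from the request schema can be set in the same body. Below is an illustrative sketch; the parameter values are arbitrary examples, not recommended defaults:

❯ curl -X POST \
       -N \
       -H 'Content-Type: application/json' \
       -H "Authorization: Bearer $(cat auth-token)" \
       -d '{"prompt": "What is the capital of France?", "model": "ggml-alpaca-7b-q4", "num_predict": 64, "top_k": 40, "top_p": 0.95, "temp": 0.8, "repeat_penalty": 1.3}' \
       http://localhost:6901/api/v1/llm/inference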

Generate Image

Request body schema:

{
    prompt: String,
    model: String,
    input_image: { // This is optional
      data: Vec<u8>, // input image
      mask: Option<Vec<u8>>, // mask for inpainting
      strength: Option<f64>, // how strongly to alter the original, from 0 to 1 (1 meaning replace it fully)
    },
    width: Option<i64>,
    height: Option<i64>,
    n_steps: Option<usize>,
    seed: Option<i64>,
    num_samples: Option<i64>,
    guidance_scale: Option<f64>,
}

Here is a basic example of generating an image from a text prompt, providing only the prompt and the model to use (only one sample is generated by default):

❯ curl -X POST \
       -H 'Content-Type: application/json' \
       -H "Authorization: Bearer $(cat auth-token)" \
       -d '{"prompt": "Rusty robot, desert, futuristic", "model": "sd-v2.1"}' \
       http://localhost:6901/api/v1/image/generate
{"status":"success","api_version":"v1","timestamp":"2023-04-27T18:30:51.339581406Z","data":{"image_id":"b1de5a26-79f0-42b2-ac40-8df630cdef1d"}}

This adds the image request to the generation queue. You can later query the API to retrieve the image samples:

❯ curl -H "Authorization: Bearer $(cat auth-token)" \
       http://localhost:6901/api/v1/image/b1de5a26-79f0-42b2-ac40-8df630cdef1d/samples
{"status":"success","api_version":"v1","timestamp":"2023-04-27T18:34:17.069607913Z","data":[{"data":[...omited...],"image_id":"b1de5a26-79f0-42b2-ac40-8df630cdef1d","n_sample":1,"sample_id":"45a3fe19-12e5-4a8f-acaa-5b672dec3e60"}]}

License

GPLv3
