Serge - LLaMA made easy 🦙

Serge is a chat interface crafted with llama.cpp for running Alpaca models. No API keys, entirely self-hosted!

  • 🌐 SvelteKit frontend
  • 💾 Redis for storing chat history & parameters
  • ⚙️ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the Python bindings

🎥 Demo:

demo.webm

⚡️ Quick start

🐳 Docker:

docker run -d \
    --name serge \
    -v weights:/usr/src/app/weights \
    -v datadb:/data/db/ \
    -p 8008:8008 \
    ghcr.io/serge-chat/serge:latest

🐙 Docker Compose:

services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:

Then visit http://localhost:8008/. The API documentation is available at http://localhost:8008/api/docs.
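If you script the setup, it can help to wait until the server answers before opening the UI. The helpers below are a minimal sketch, not part of Serge: `serge_base_url` and `wait_for_serge` are hypothetical names, the retry count and 2-second pause are arbitrary, and the sketch assumes `curl` is installed and that the API docs page returns a success status once the container is up.

```shell
# Build the base URL; the defaults match the port published in the quick start.
serge_base_url() {
    printf 'http://%s:%s' "${1:-localhost}" "${2:-8008}"
}

# Poll the API docs page until it answers or the retries run out.
# Returns 0 on the first successful response, 1 otherwise.
wait_for_serge() {
    _url="$(serge_base_url)/api/docs"
    _retries="${1:-30}"
    _i=0
    while [ "$_i" -lt "$_retries" ]; do
        if curl --silent --fail --output /dev/null "$_url"; then
            return 0
        fi
        _i=$((_i + 1))
        sleep 2
    done
    return 1
}
```

For example, `wait_for_serge 15 && echo "Serge is up at $(serge_base_url)"` would give the container roughly 30 seconds to start.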

🖥️ Windows Setup

Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.

☁️ Kubernetes & Docker Compose Setup

Instructions for setting up Serge on Kubernetes can be found in the wiki.

🧠 Supported Models

We currently support the following models:

  • Airoboros 🎈
    • Airoboros-7B
    • Airoboros-13B
    • Airoboros-30B
  • Alpaca 🦙
    • Alpaca-LoRA-65B
    • GPT4-Alpaca-LoRA-30B
  • Chronos 🌑
    • Chronos-13B
    • Chronos-33B
  • GPT4All 🌍
    • GPT4All-13B
  • Guanaco 🦙
    • Guanaco-7B
    • Guanaco-13B
    • Guanaco-33B
    • Guanaco-65B
  • Koala 🐨
    • Koala-7B
    • Koala-13B
  • Llama 🦙
    • FinLlama-33B
    • Llama-Supercot-30B
  • Lazarus 💀
    • Lazarus-30B
  • Nous 🧠
    • Nous-Hermes-13B
  • OpenAssistant 🎙️
    • OpenAssistant-30B
  • Samantha 👩
    • Samantha-7B
    • Samantha-13B
    • Samantha-33B
  • Tulu 🎚
    • Tulu-7B
    • Tulu-13B
    • Tulu-30B
  • Vicuna 🦙
    • Stable-Vicuna-13B
    • Vicuna-CoT-7B
    • Vicuna-CoT-13B
    • Vicuna-v1.1-7B
    • Vicuna-v1.1-13B
    • VicUnlocked-30B
    • VicUnlocked-65B
  • Wizard 🧙
    • Wizard-Mega-13B
    • Wizard-Vicuna-Uncensored-7B
    • Wizard-Vicuna-Uncensored-13B
    • Wizard-Vicuna-Uncensored-30B
    • WizardLM-30B
    • WizardLM-Uncensored-7B
    • WizardLM-Uncensored-13B
    • WizardLM-Uncensored-30B

Additional weights can be copied into the weights volume, which is mounted at /usr/src/app/weights in the container, using docker cp:

docker cp ./my_weight.bin serge:/usr/src/app/weights/
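To confirm that a file actually landed in the container, the copy can be wrapped in a small helper. This is a sketch under assumptions: `add_weight` is a hypothetical name, the container is assumed to be called `serge` as in the quick start, and `ls` is assumed to be available inside the image.

```shell
# Copy one weight file into the running "serge" container,
# then list the weights directory so the new file is visible.
add_weight() {
    if [ ! -f "$1" ]; then
        echo "not a file: $1" >&2
        return 1
    fi
    docker cp "$1" serge:/usr/src/app/weights/ &&
        docker exec serge ls -lh /usr/src/app/weights
}
```

Calling `add_weight ./my_weight.bin` copies the file and prints the directory listing; a missing local file is rejected before Docker is invoked.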

⚠️ Memory Usage

LLaMA will crash if you don't have enough available memory for the model:

Model       RAM Required
7B          4.5GB
7B-q6_K     8.03GB
13B         12GB
13B-q6_K    13.18GB
30B         20GB
30B-q6_K    29.19GB
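A quick pre-flight check can compare the figures above with the memory that is actually free. The functions below are an illustrative sketch, not something Serge ships: the names are hypothetical, the numbers are copied from the table, and `available_ram_gb` assumes a Linux host that exposes `MemAvailable` in /proc/meminfo.

```shell
# RAM required per model size, in GB, copied from the table above.
required_ram_gb() {
    case "$1" in
        7B) echo 4.5 ;;
        7B-q6_K) echo 8.03 ;;
        13B) echo 12 ;;
        13B-q6_K) echo 13.18 ;;
        30B) echo 20 ;;
        30B-q6_K) echo 29.19 ;;
        *) return 1 ;;
    esac
}

# Succeeds when the given amount of memory (GB) covers the model's requirement.
# Returns 2 for a model size that is not in the table.
fits_in_memory() {
    _need="$(required_ram_gb "$1")" || return 2
    awk -v need="$_need" -v have="$2" 'BEGIN { exit !(have >= need) }'
}

# Available memory in GB on Linux (MemAvailable is reported in kB;
# 1 GB is treated as 1024 * 1024 kB here).
available_ram_gb() {
    awk '/^MemAvailable:/ { printf "%.2f\n", $2 / 1024 / 1024 }' /proc/meminfo
}
```

For example, `fits_in_memory 13B "$(available_ram_gb)" || echo "not enough memory for 13B"` warns before a model is loaded.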

💬 Support

Need help? Join our Discord

🤝 Contributing

If you discover a bug or have a feature idea, feel free to open an issue or PR.

To run Serge in development mode:

git clone https://github.com/serge-chat/serge.git
cd serge
DOCKER_BUILDKIT=1 docker compose -f docker-compose.dev.yml up -d --build

Contributors

dependabot[bot], nsarrazin, gaby, pabl-o-ce, rakete, snxraven, axolotlite, agronholm, fenarksec, jsonsmth, johncadengo, justinguese, louisoutin, mavaa, steelalloy, migelo, noproto, paraskevasleivadaros, thomasleveil, robotdjman, security-companion
