LLM App Dev Workshop

Introduction

This repository demonstrates how to build a simple LLM-based chatbot that can answer questions based on your documents (retrieval augmented generation - RAG) and how to deploy it using Podman or on the OpenShift Container Platform (k8s).
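To make the idea of retrieval augmented generation concrete, here is a toy sketch of the retrieve-then-prompt step. It is not the repository's implementation (which relies on LlamaIndex and real embeddings); it assumes a simple bag-of-words cosine similarity as a stand-in for embeddings, and the function names `retrieve` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Put the retrieved context in front of the question (the 'augmented' part)."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The resulting prompt would then be sent to the LLM. A real setup replaces the word counts with embedding vectors, but the flow (score documents, pick the best, prepend them to the question) stays the same.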

The corresponding workshop, first run at Red Hat Developers Hands-On Day 2023 in Darmstadt, Germany, teaches participants the basic concepts of LLMs and RAG, and how to adapt this example implementation into their own special-purpose GPT.

The software stack uses only open source tools: Streamlit, LlamaIndex, and local open LLMs via Ollama. Real open AI for the GPU poor.

Everyone is invited to fork this repository, create their own special-purpose chatbot based on their documents, improve the setup, or even hold their own workshop.

Setup

For the local setup, a Mac M1 with at least 16 GB of unified memory is recommended. First download Ollama from ollama.ai and install it.

On Linux you can disable the Ollama service for better debugging:

sudo systemctl disable ollama
sudo systemctl stop ollama

and then manually run ollama serve.

For the local example have a look at the folder streamlit and install the requirements:

pip install -r requirements.txt

Then start streamlit with:

streamlit run app.py

Modify the system prompt and copy different data sources to docs/ in order to create your own version of the chatbot. You can set the Ollama host via the environment variable OLLAMA_HOST.

You can download models locally with ollama pull zephyr or via API:

curl -X POST http://ollama:11434/api/pull -d '{"name": "zephyr"}'

Make sure the Ollama service is running as described above and the Zephyr model has been downloaded. To test the Ollama server, you can call the generate API:

curl -X POST http://ollama:11434/api/generate -d '{"model": "zephyr", "prompt": "Why is the sky blue?"}'
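The same call can be sketched in Python with only the standard library. This assumes that the generate endpoint streams its answer as one JSON object per line, each carrying a "response" text fragment and a "done" flag; that is how the Ollama API behaves as far as we know, but treat the field names as an assumption and check them against the Ollama API documentation. The helper names `generate_payload`, `join_stream` and `ask` are hypothetical.

```python
import json
import urllib.request


def generate_payload(model, prompt):
    """Encode the same JSON body that the curl command sends."""
    return json.dumps({"model": model, "prompt": prompt}).encode("utf-8")


def join_stream(lines):
    """Concatenate the 'response' fragments of a line-delimited JSON stream.

    Assumption: each non-empty line is a JSON object with an optional
    'response' string and a 'done' flag that is true on the last object.
    """
    parts = []
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line:
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


def ask(base_url, model, prompt):
    """POST to <base_url>/api/generate and return the full answer text."""
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=generate_payload(model, prompt),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # Iterating over the HTTP response yields one line of bytes at a time.
        return join_stream(resp)
```

With a running server, `ask("http://ollama:11434", "zephyr", "Why is the sky blue?")` would return the answer as a single string.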

All of these commands are also documented in our cheat sheet.

Deployment

Podman

Build the container based on UBI9 Python 3.11:

podman build -t linuxbot-app .

If you build on an arm64 Mac and deploy on amd64, remember to add --platform (in this case our base image is amd64 anyway):

podman build --platform="linux/amd64" -t linuxbot-app .

We will create a network for our linuxbot and ollama:

podman network create linuxbot

Check whether DNS is enabled on the new network (it is not enabled on the default network):

podman network inspect linuxbot
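If you prefer to check this programmatically, the inspect output can be parsed as JSON. The sketch below assumes that `podman network inspect` prints a JSON list of network objects with a "name" and a boolean "dns_enabled" field, as recent Podman versions do; verify the field name against your Podman version. The function names are hypothetical.

```python
import json
import subprocess


def dns_enabled(inspect_output, network="linuxbot"):
    """Return True if the named network reports DNS as enabled.

    Assumption: inspect_output is a JSON list of objects that each have a
    'name' and (on recent Podman versions) a boolean 'dns_enabled' field.
    """
    for net in json.loads(inspect_output):
        if net.get("name") == network:
            return bool(net.get("dns_enabled", False))
    raise KeyError(f"network {network!r} not found in inspect output")


def inspect_network(network="linuxbot"):
    """Run 'podman network inspect <network>' and return its raw output."""
    result = subprocess.run(
        ["podman", "network", "inspect", network],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a machine with Podman installed, `dns_enabled(inspect_network("linuxbot"))` should be true for the network created above; without DNS, the app container cannot resolve the hostname ollama.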

Now you can either start Ollama locally with ollama serve or start an Ollama container with:

podman run --net linuxbot --name ollama -p 11434:11434 --rm docker.io/ollama/ollama:latest

Note: We just forward the port so we can curl it more easily locally as well.

This Ollama container won't have GPU support enabled and will be much slower than running Ollama locally, for example on a Mac M1.

Since we create the embeddings locally in the Streamlit app, we need to increase the shared memory for PyTorch in order to get it running:

podman run --net linuxbot --name linuxbot-app -p 8080:8080 --shm-size=2gb -e OLLAMA_HOST=ollama -it --rm localhost/linuxbot-app

You can set the Ollama server via the environment variable OLLAMA_HOST; the default is localhost.
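A minimal sketch of how an application could resolve this setting is shown here. It is an illustration, not the code from app.py: it assumes that OLLAMA_HOST holds a bare hostname and that the port is always 11434 (the port used in the commands in this README), and the name `ollama_base_url` is hypothetical.

```python
import os

DEFAULT_HOST = "localhost"  # default stated in this README
DEFAULT_PORT = 11434        # port used by the curl and podman commands


def ollama_base_url(env=None):
    """Build the Ollama base URL from OLLAMA_HOST, falling back to localhost.

    Assumption: OLLAMA_HOST is a bare hostname such as 'ollama',
    without scheme or port.
    """
    env = os.environ if env is None else env
    host = env.get("OLLAMA_HOST") or DEFAULT_HOST
    return f"http://{host}:{DEFAULT_PORT}"
```

With `-e OLLAMA_HOST=ollama` as in the podman command above, this resolves to the container named ollama on the shared network; with the variable unset, it falls back to a locally running `ollama serve`.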

NOTE: It would be much better to generate the embeddings with the Ollama service, but this is not yet supported in LlamaIndex.

OpenShift

Create a new project (namespace) for your workshop and deploy the ollama service in it:

oc new-project my-workshop
oc apply -f deployments/ollama.yaml

If you want to enable GPU support, you have to install and instantiate the NVIDIA GPU Operator and the Node Feature Discovery (NFD) Operator as described on the AI on OpenShift page, then deploy ollama-gpu.yaml instead:

oc apply -f deployments/ollama-gpu.yaml

The Streamlit application (linuxbot) can be deployed with:

oc apply -f deployments/linuxbot.yaml

We have published a preconfigured container image on quay.io/sroecker that is used in this deployment.

In order to debug your application and ollama service you can deploy a curl image like this:

oc run mycurl --image=curlimages/curl -it -- sh
oc attach mycurl -c mycurl -i -t
oc delete pod mycurl

Contributors

sroecker, kirigould
