helixml / helix

208.0 5.0 15.0 37.15 MB

Multi-node production AI stack. Run the best of open source AI easily on your own servers. Create your own AI by fine-tuning open source models. Integrate LLMs with APIs. Run gptscript securely on the server

Home Page: https://tryhelix.ai

License: Other

Languages: Shell 0.62%, Go 54.98%, TypeScript 41.22%, HTML 0.15%, Dockerfile 0.14%, Python 2.57%, Mako 0.05%, Smarty 0.27%
Topics: golang, llama, llm, mistral, openai, self-hosted, mixtral, sdxl, stable-diffusion, api

helix's People

Contributors: bigadamknight, binocarlos, chocobar, lukemarsden, obianuoobi, philwinder, rusenask

helix's Issues

non-english language qapairs

currently the QA-pair generation seems to translate non-English input data to English, however we have users who want to be able to do it all in, say, French

when this is working, get back to the French user on Crisp

JSONL input data

If the user uploads their own qapairs, skip the qapair generation phase

Multi GPU support

Support multiple GPUs on a single node. Initially we can work around this by running N runners, with CUDA_VISIBLE_DEVICES passed through to each runner's Python process
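The N-runners workaround could be sketched like this (the runner.py entry point and launch command are assumptions, not the actual runner invocation):

```python
import os
import subprocess

def runner_env(gpu_index, base_env=None):
    # Copy the environment and pin this runner to a single GPU.
    env = dict(base_env if base_env is not None else os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env

def launch_runners(num_gpus, runner_cmd=("python", "runner.py")):
    # One runner process per GPU; each process sees exactly one device.
    return [subprocess.Popen(runner_cmd, env=runner_env(i)) for i in range(num_gpus)]
```

Each process then behaves as a normal single-GPU runner, so the scheduler needs no changes for a first pass.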

show API calls to replicate many actions

(e.g. text & image inference to start with)

basically show the curl equivalent of the UI action - i.e. make it clear that you can use the API for each of these actions
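A sketch of generating the curl equivalent of a UI action (the /api/v1/inference path and the payload shape are illustrative assumptions, not Helix's actual API):

```python
import json
import shlex

def curl_for_inference(api_host, api_key, session_id, prompt):
    # Hypothetical endpoint and payload; swap in the real API route.
    payload = json.dumps({"session_id": session_id, "prompt": prompt})
    return (
        f"curl -X POST {shlex.quote(api_host + '/api/v1/inference')} "
        f"-H 'Authorization: Bearer {api_key}' "
        f"-H 'Content-Type: application/json' "
        f"-d {shlex.quote(payload)}"
    )
```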

the session page scrolls to the bottom randomly

There is a useMemo that is re-running (possibly triggered by Keycloak) and causing the "the session has changed, scroll to the bottom" behaviour even when the session clearly has not changed. It's annoying because you are actively scrolling up and down just reading, and then the page suddenly jumps to the bottom.

too few questions in small dataset

If you put a small bit of text like:

Bob lives at 6 Crow Terrace

It will generate a single question/answer pair, and then axolotl complains that there are too few questions in the training dataset
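One fix is to validate the dataset size before handing it to axolotl, so the user gets an actionable error instead of a training failure (the threshold of 10 is an assumed value, not axolotl's actual minimum):

```python
MIN_QA_PAIRS = 10  # assumed threshold; axolotl's real minimum may differ

def validate_dataset(qa_pairs):
    # Fail early, before training starts, when too few pairs were generated.
    if len(qa_pairs) < MIN_QA_PAIRS:
        raise ValueError(
            f"only {len(qa_pairs)} question/answer pairs were generated; "
            f"at least {MIN_QA_PAIRS} are needed to fine-tune - try adding more text"
        )
```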

Model seems obsessed with more fine tuning of dataset

Having submitted a document (random doc, outline of a fictional story), then asking what a character should do in the story, I keep being met with "Character should continue fine-tuning the data to improve the accuracy of the model." This seemed to be an inescapable answer, no matter how I posed the question.

It also does not appear to learn from any further conversation I have after the dataset is submitted.

switch to isStale everywhere

the logic for whether a model instance is stale is currently in 3 places (search for stale := and nonStale :=)

move it to one
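The consolidated check could be a single predicate like this (shown in Python for brevity; the real logic is in the Go runner code, and the TTL parameter name is an assumption):

```python
from datetime import datetime, timedelta

def is_stale(last_activity, ttl_seconds, now=None):
    # Single source of truth: a model instance is stale once it has been
    # idle for longer than its TTL.
    now = now or datetime.utcnow()
    return (now - last_activity) > timedelta(seconds=ttl_seconds)
```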

url box mime type detection

if you put a URL to a file in the URL box - detect the bloody MIME type so we don't split docs that are downloaded

the URL box should download files first
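A minimal sketch of the detection, assuming we prefer the server's Content-Type header (e.g. from a HEAD request) and fall back to the URL's file extension:

```python
import mimetypes
from urllib.parse import urlparse

def guess_mime(url, content_type_header=None):
    # Prefer the Content-Type the server reports.
    if content_type_header:
        return content_type_header.split(";")[0].strip()
    # Otherwise fall back to guessing from the URL path's extension.
    mime, _ = mimetypes.guess_type(urlparse(url).path)
    return mime or "application/octet-stream"
```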

Old list "done"

Things we did whilst using the old "list"

  • when continuing a cloned session, the messages are missing
  • if there are no files - the "view files" button shows an error
  • "add new documents" button at bottom of text session (add more documents, dataprep new ones into jsonl qa-pairs, concatenate qa-pairs, retrain model)
  • retry button for errors
  • plugin sentry
  • share mode where original training data is not copied
  • auto-scroll broken
  • put the name of the session in topbar
  • rather than system as the username, put the name of the session
  • sessions are updating other sessions https://mlops-community.slack.com/archives/C0675EX9V2Q/p1702476943225859
  • add a restart button whilst doing a fine-tune so if things get stuck we can restart
    • possibly only show this if we've not seen any progress for > 30 seconds (fixed by the error throwing an error if runner reports job still active)
  • dashboard not showing finetune interactions
  • performance of auto-save before login (image fine tune text is slow)
  • for session updates check we are on the same page
    • whilst we are on one page and another session is processing - it's updating the page we are on with the wrong session
  • react is rendering streaming updates to the sessions slowly
  • progress bars on text fine tuning
  • fork session (fork from an interaction)
  • add data after the model is trained
  • pdfs are broken in production
  • for HTML conversion, use Puppeteer to render the page into a PDF, then convert the PDF into plain text
  • reliable and fast, scale to 5 concurrent users (Luke)
    • Dockerize the runner & deploy some on vast.ai / runpod.io
  • finish and deploy dashboard
  • logged out state when trying to do things - show a message "please register"
  • fix bug with "create image" dropdown etc not working
  • fix bug with openAI responding with "GPT 4 Answer: Without providing a valid context, I am unable to generate 50 question and answer pairs as requested"
    • make it so user can see whole message from OpenAI
  • replace the thinking face with a spinning progress (small horizontal bouncing three dots)
  • there is a dashboard bug where the runner model job history reverses itself
  • you lose keyboard focus when the chat box disables and re-enables
  • make the chatbox have keyboard focus the first time you load the page
  • pasting a long chunk of text into training text box makes the box go taller than the screen and you cannot scroll
  • create images says “chat with helix” should say “describe what you want to see in an image”
  • enforce min-width on left sidebar
  • the event cancel handler on drop downs is not letting you click the same mode
  • hide technical details behind "technical details" button ?
    • where it currently says "Session ...." - put the session title
    • put a link next to "View Files" called "Info" that will open a modal window with more session details
    • e.g. we put the text summary above in the modal along with the ID and other things we want to show
    • in the text box say "Chat with Helix" <- for txt models
    • in the text box say "Make images with Helix" <- for image models
  • edit session name (pencil icon to left of bin icon)
  • obvious buttons (on fine tuning)
    • in default starting state - make both buttons (add docs / text) - blue and outlined
    • in the default starting state - make the files button say "or choose files"
    • when you start typing in the box make the "Add Text" button pink and make the upload files not pink
    • once there are > 0 files - make the "choose more files" button outlined so the "upload docs" is the main button
  • performance on text fine tuning (add concurrency to openAI calls)
  • URL to fetch text for text fine tuning
  • homepage uncomment buttons
  • re-train, will add more interactions to add files to
  • we should keep previous Lora files at the interaction level
  • we hoist lora_dir from the latest interaction to the session

place in the queue indication

if it's more than 5 seconds

we already have the "this is taking a while" window - this is to show the place in the queue also
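The position lookup itself is simple, assuming the control plane exposes the ordered list of waiting session IDs (the message wording is illustrative):

```python
def queue_position(queue, session_id):
    # 0-based place in the queue, or None if the session is not waiting.
    try:
        return queue.index(session_id)
    except ValueError:
        return None

def queue_message(queue, session_id):
    # Text to show alongside the "this is taking a while" window.
    pos = queue_position(queue, session_id)
    return None if pos is None else f"You are number {pos + 1} in the queue"
```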

new activity dot

show a dot next to sessions that are currently active or have new replies

check URL type

make it clear that URLs need to point to text content - for example, a YouTube URL will not work

Use huggingface tokenizer chat template for inference

In the LLM model Go code (e.g. here) we build up a prompt that is a formatted string based on the chat template associated with the model.

We could instead store a generic JSON-ised version of the chat history in task.Prompt, like:

[{"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "It's Paris."}]

and then use the model's tokenizer to format the messages for us inside axolotl at inference time:

import json
from transformers import AutoTokenizer

messages = json.loads(json_messages)
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoded_messages = tokenizer.apply_chat_template(messages, tokenize=False)

This will reduce the effort needed to add subsequent models with potentially different chat templates.
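On the Go side this would mean storing the history as JSON; here is a Python sketch of the shape task.Prompt would then contain (the helper name is hypothetical):

```python
import json

def chat_history_to_prompt(history):
    # history is a list of (role, content) tuples; the JSON string produced
    # here is what would be stored in task.Prompt for the tokenizer to format.
    return json.dumps(
        [{"role": role, "content": content} for role, content in history]
    )
```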

scheduler not hitting spun up model

quite often there's a model ready to serve and a new one gets spun up on the other node - maybe the clocks are drifting between the machines, so the 2-second head start doesn't work? or the Python processes aren't polling every 100ms or something?

fine tuning hangs

why? do we need to automatically restart things if they haven't started in a timeout?

analyse all sessions in the database

for each one:

  • are there errors? if so, add issues to github. calculate which issues caused the most errors
  • is there a trained model with no interactions? if so, add to Chris's spreadsheet and ping him. also, #43
  • were they successful at doing anything?

overall: what % of sessions were successful and what were the biggest pain points? categorise the use cases

url error reporting

detect when we did not manage to extract any text, and tell the user that this is the error
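A sketch of the check, with an illustrative message (the function name and wording are assumptions):

```python
def extraction_error(url, extracted_text):
    # Return a user-facing error when no text came back, or None on success.
    if extracted_text and extracted_text.strip():
        return None
    return (
        f"We could not extract any text from {url} - "
        "please check that it points at text content"
    )
```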
