LLM Engine

🚀 The open source engine for fine-tuning and serving large language models. 🚀

Scale's LLM Engine is the easiest way to customize and serve LLMs. Models can be accessed through Scale's hosted version, or you can use the Helm charts in this repository to run model inference and fine-tuning in your own infrastructure.

💻 Quick Install

pip install scale-llm-engine
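
To confirm the install worked before going further, a small check like the one below can help. It is a sketch that uses only the Python standard library; the helper name `install_report` is made up for illustration, and the distribution and module names are taken from the `pip install` line above and the starter code later in this README.

```python
from importlib import metadata, util


def install_report(dist="scale-llm-engine", module="llmengine"):
    """Report the installed version of `dist` and whether `module` can be found."""
    try:
        version = metadata.version(dist)
    except metadata.PackageNotFoundError:
        version = None  # the distribution is not installed in this environment
    importable = util.find_spec(module) is not None
    return {"version": version, "importable": importable}


print(install_report())
```

If `version` is `None` or `importable` is `False`, the package was probably installed into a different Python environment than the one running the check.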

🤔 About

Foundation models are emerging as the building blocks of AI. However, deploying these models to the cloud and fine-tuning them are expensive operations that require infrastructure and ML expertise. Deployments are also difficult to maintain over time as new models are released and new techniques for both inference and fine-tuning become available.

LLM Engine is a Python library, CLI, and Helm chart that provides everything you need to serve and fine-tune foundation models, whether you use Scale's hosted infrastructure or do it in your own cloud infrastructure using Kubernetes.

Key Features

🎁 Ready-to-use APIs for your favorite models: Deploy and serve open-source foundation models, including LLaMA, MPT and Falcon. Use Scale-hosted models or deploy to your own infrastructure.

🔧 Fine-tune foundation models: Fine-tune open-source foundation models on your own data for optimized performance.

🎙️ Optimized Inference: LLM Engine provides inference APIs for streaming responses and dynamically batching inputs for higher throughput and lower latency.

🤗 Open-Source Integrations: Deploy any Hugging Face model with a single command.

Features Coming Soon

๐Ÿณ K8s Installation Documentation: We are working hard to document installation and maintenance of inference and fine-tuning functionality on your own infrastructure. For now, our documentation covers using our client libraries to access Scale's hosted infrastructure.

โ„ Fast Cold-Start Times: To prevent GPUs from idling, LLM Engine automatically scales your model to zero when it's not in use and scales up within seconds, even for large foundation models.

💸 Cost Optimization: Deploy AI models more cheaply than commercial alternatives, even after accounting for cold-start and warm-down times.

🚀 Quick Start

First, navigate to Scale Spellbook and create an account, then grab your API key from the Settings page. Set this API key as the SCALE_API_KEY environment variable by adding the following line to your .zshrc or .bash_profile:

export SCALE_API_KEY="[Your API key]"

If you run into an "Invalid API Key" error, you may need to run the . ~/.zshrc command to re-read your updated .zshrc.
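
If the error persists, it can help to check whether the variable is actually visible to the Python process you are running. The snippet below is a minimal sketch using only the standard library; `api_key_status` is a hypothetical helper written for this README, not part of the LLM Engine client.

```python
import os


def api_key_status(env=os.environ):
    """Describe whether SCALE_API_KEY is visible, without printing the full key."""
    key = env.get("SCALE_API_KEY", "")
    if not key.strip():
        return "SCALE_API_KEY is not set in this process"
    # Show only the last four characters so the key never appears in full.
    return f"SCALE_API_KEY is set (ends with ...{key[-4:]})"


print(api_key_status())
```

A "not set" message usually means the shell that launched Python has not re-read the updated profile file yet.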

With your API key set, you can now send LLM Engine requests using the Python client. Try out this starter code:

from llmengine import Completion

response = Completion.create(
    model="falcon-7b-instruct",
    prompt="I'm opening a pancake restaurant that specializes in unique pancake shapes, colors, and flavors. List 3 quirky names I could name my restaurant.",
    max_new_tokens=100,
    temperature=0.2,
)

print(response.output.text)

You should see a successful completion of your given prompt!
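
The Key Features list mentions streaming responses. The sketch below shows one way a streamed completion might be consumed. It assumes that `Completion.create` accepts a `stream=True` flag and then yields chunks shaped like the `response.output.text` object in the starter code; neither assumption is confirmed by this README, so check the LLM Engine documentation before relying on it. The helper names `collect_stream` and `stream_quickstart_prompt` are made up for illustration, and the demo at the bottom runs offline against stand-in chunks.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Print each streamed chunk as it arrives and return the joined text."""
    pieces = []
    for response in stream:
        output = getattr(response, "output", None)
        if output is None:  # assumption: some chunks may carry no output
            continue
        print(output.text, end="", flush=True)
        pieces.append(output.text)
    print()
    return "".join(pieces)


def stream_quickstart_prompt():
    """Not called here: needs the client installed and SCALE_API_KEY set."""
    from llmengine import Completion

    stream = Completion.create(
        model="falcon-7b-instruct",
        prompt="List 3 quirky names for a pancake restaurant.",
        max_new_tokens=100,
        temperature=0.2,
        stream=True,  # assumption: this flag is not shown in the starter code
    )
    return collect_stream(stream)


# Offline demo with stand-in chunks shaped like response.output.text.
fake_chunks = [
    SimpleNamespace(output=SimpleNamespace(text="Flip ")),
    SimpleNamespace(output=None),
    SimpleNamespace(output=SimpleNamespace(text="Side")),
]
full_text = collect_stream(fake_chunks)
```

Call `stream_quickstart_prompt()` yourself once the client is installed and the API key is set; it is left uncalled so the example runs without network access.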

What's next? Visit the LLM Engine documentation pages for more on the Completion and FineTune APIs and how to use them. Check out this blog post for an end-to-end example.

Contributors

yunfeng-scale, ian-scale, saiatmakuri, squeakymouse, seanshi-scale, song-william, phil-scale, ruizehung-scale, sam-scale, dependabot[bot], tiffzhao5, acmatscale, jihan-yin, edwardpark97, francesy-scale, eltociear, jaisanliang, ruizehung, rkaplan, gargutsav, yixu34, mfagundo-scale
