
llm's Introduction

LLM


A CLI utility and Python library for interacting with Large Language Models, both via remote APIs and models that can be installed and run on your own machine.

Run prompts from the command-line, store the results in SQLite, generate embeddings and more.

Full documentation: llm.datasette.io

Installation

Install this tool using pip:

pip install llm

Or using Homebrew:

brew install llm

Detailed installation instructions.

Getting started

If you have an OpenAI API key you can get started using the OpenAI models right away.

As an alternative to OpenAI, you can install plugins to access models by other providers, including models that can be installed and run on your own device.

Save your OpenAI API key like this:

llm keys set openai

This will prompt you for your key like so:

Enter key: <paste here>

Now that you've saved a key you can run a prompt like this:

llm "Five cute names for a pet penguin"
1. Waddles
2. Pebbles
3. Bubbles
4. Flappy
5. Chilly

Read the usage instructions for more.

Installing a model that runs on your own machine

LLM plugins can add support for alternative models, including models that run on your own machine.

To download and run Mistral 7B Instruct locally, you can install the llm-gpt4all plugin:

llm install llm-gpt4all

Then run this command to see which models it makes available:

llm models
gpt4all: all-MiniLM-L6-v2-f16 - SBert, 43.76MB download, needs 1GB RAM
gpt4all: orca-mini-3b-gguf2-q4_0 - Mini Orca (Small), 1.84GB download, needs 4GB RAM
gpt4all: mistral-7b-instruct-v0 - Mistral Instruct, 3.83GB download, needs 8GB RAM
...

Each model file will be downloaded once the first time you use it. Try Mistral out like this:

llm -m mistral-7b-instruct-v0 'difference between a pelican and a walrus'

You can also start a chat session with the model using the llm chat command:

llm chat -m mistral-7b-instruct-v0
Chatting with mistral-7b-instruct-v0
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
> 

Using a system prompt

You can use the -s/--system option to set a system prompt, providing instructions for processing other input to the tool.

To describe how the code in a file works, try this:

cat mycode.py | llm -s "Explain this code"

Help

For help, run:

llm --help

You can also use:

python -m llm --help

llm's People

Contributors

almet, amjith, benjamin-kirkbride, cmungall, corneliusroemer, dependabot[bot], dokeet, ealvar3z, flabat, gohanlon, mhalle, nakamichiworks, pavelkraleu, sderev, sherwind, simonw, themaxdavitt, till--h, will-so

llm's Issues

llm logs triggers an API request

After setting up OPENAI_API_KEY and running llm init-db, when I now run llm logs I get an API response to the prompt "logs" (an explanation of logs).

What I expected was to see the log entries.

Accept input from standard in

Enables this kind of pattern:

curl -s 'https://simonwillison.net/2023/May/15/per-interpreter-gils/' | \
  llm --system 'A hilarious joke about this post' --stream

Output just now:

Why did the Python developer compile Python themselves just to test the Per-Interpreter GIL feature?

Because they wanted to thread the needle!

Reconsider `llm chatgpt` command and general command design

Options for an interactive chat mode:

  • It's a new command - llm chat -m gpt-4 for example. This feels a bit odd since the current default command is actually llm chatgpt ... and llm chat feels confusing.
  • It's part of the default command: llm --chat -4 starts one running.

Maybe the llm chatgpt command is mis-named, especially since it can be used to work with GPT-4.

I named it llm chatgpt because I thought I'd have a separate command for bard and for llama and so on, and because I thought the other OpenAI completion APIs (the non-chat ones, like GPT-3) might end up with a separate command.

Originally posted by @simonw in #6 (comment)

--code mode for outputting code

Since this is a CLI tool, it's nice to be able to redirect output with > file.py to save generated Python code.

Problem: it usually comes wrapped in triple backticks.

Solution: a --code option which sets a system prompt to avoid that happening.
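
As a rough sketch of the kind of post-processing such an option could combine with, here is a small Python helper that strips a surrounding fence if the model emits one anyway (the function name and regex are illustrative, not part of llm):

import re

def strip_code_fence(text):
    # Remove a single surrounding ``` fence (with optional language tag), if present
    match = re.match(r"^```[\w+-]*\n(.*?)\n?```\s*$", text, re.DOTALL)
    return match.group(1) if match else text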

Consider using XDG_DATA_HOME directory instead of .llm

The XDG base directory spec has a directory that would be perfect for this tool's database: ~/.local/share/llm.

There are two big advantages to using XDG_DATA_HOME instead of ~/.llm:

  • Users can easily configure the directory to use by setting XDG_DATA_HOME
  • Limits the explosion of dot directories in the home directory

Here's how I implement it in my similar tool, for reference: use XDG_DATA_HOME if present, otherwise default to ~/.local/share/<app_name>.
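
A minimal sketch of that lookup in Python, assuming the fallback described above (the helper is illustrative, not llm's actual implementation):

import os
from pathlib import Path

def data_dir(app_name="llm"):
    # Use XDG_DATA_HOME when set, otherwise fall back to ~/.local/share/<app_name>
    base = os.environ.get("XDG_DATA_HOME") or Path.home() / ".local" / "share"
    return Path(base) / app_name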

Include system prompts in llm templates list

Following:

% llm templates list
bad       : this is bad
joke      : Tell a really funny and short joke, surprise me
long      : This is a really long prompt. It's long long long. This is a really long prompt. It's long long long. This is a really long prompt....
recipe    : Suggest a recipe using ingredients: $ingredients  It should be based on cuisine from this country: $country
roast     : 
steampunk : Summarize the following text. Insert frequent satirical steampunk-themed illustrative anecdotes. Really go wild with that. Text to ...
summarize : 
summary   : Summarize this: $input

Move log.db database to the new user_data_dir

Some thoughts on that migration:

Should I have the tool perform a one-off migration when you upgrade it, to move ~/.llm/log.db to the new location?

I think not - instead, I'll have llm init-db take an optional argument for starting the database by copying an existing one, then mention that upgrade path in the release notes.

Originally posted by @simonw in #7 (comment)

Documentation on writing plugins

I'm not going to implement the same one-off plugin mechanism as Datasette, so instead I'll have to teach people how to develop package plugins locally with pip install -e, and show them how to create wheel files they can install elsewhere in case they don't want to push packages to PyPI.
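
For illustration, a hypothetical plugin module that could be developed locally with pip install -e. The "llm" marker name and the register_models hook follow the hookspec sketched further down in this list; both are assumptions about the eventual plugin API:

# my_llm_plugin.py (hypothetical)
import pluggy

hookimpl = pluggy.HookimplMarker("llm")

@hookimpl
def register_models(llm):
    """Return a list of Models to make available to the tool."""
    return []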

Better automated tests

I need to mock the OpenAI API calls so I can write tests against the llm "prompt" command.
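
A rough sketch of what such a test could look like, assuming the command is a Click CLI and that it calls openai.ChatCompletion.create() internally. The import path, patch target and response shape are all assumptions:

from unittest.mock import patch
from click.testing import CliRunner

from llm.cli import cli  # assumed entry point

def test_prompt_command():
    fake_response = {
        "choices": [{"message": {"content": "mocked reply"}}],
        "model": "gpt-3.5-turbo-0301",
    }
    with patch("openai.ChatCompletion.create", return_value=fake_response):
        result = CliRunner().invoke(cli, ["say hello"])
    assert result.exit_code == 0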

Command for browsing captured logs

The best way to do this will be with Datasette or sqlite-utils, but it would be neat to have a basic history command built into llm itself.
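
In the meantime, the captured prompts can be read straight out of SQLite. The table and column names in this sketch are assumptions about the log.db schema:

import sqlite3
from pathlib import Path

conn = sqlite3.connect(Path.home() / ".llm" / "log.db")
for prompt, response in conn.execute(
    "select prompt, response from logs order by rowid desc limit 5"
):
    print(prompt, "->", (response or "")[:80])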

Updated schema design for 0.4

I want to make a few schema changes in time for 0.4:

  • Add an explicit id integer column so I don't have to remember to select rowid
  • chat_id should be an integer, and a foreign key to id
  • Ditch the concept of a provider and just use the model column

One thing I'm torn on right now: should I keep the system prompt as a separate column, even though most models other than OpenAI don't have that as a concept?
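
A rough sketch of that revised table using sqlite-utils; the columns not named above (prompt, system, response) are assumptions:

import sqlite_utils

db = sqlite_utils.Database("log.db")
db["logs"].create(
    {
        "id": int,        # explicit integer primary key, no more relying on rowid
        "model": str,     # just the model, no separate provider column
        "prompt": str,
        "system": str,    # undecided: may not apply to models other than OpenAI's
        "response": str,
        "chat_id": int,   # integer foreign key back to logs.id
    },
    pk="id",
    foreign_keys=[("chat_id", "logs", "id")],
)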

Record actual model used to run the prompt

Right now I'm just recording the model that was requested, e.g. gpt-3.5-turbo in the model column.

But... it turns out the response from OpenAI includes this - "model": "gpt-3.5-turbo-0301" - and there are meaningful differences between those model versions, e.g. the latest is gpt-3.5-turbo-0613 but you have to opt into it.

I'd like to record the model that was actually used. Not sure how best to put this in the schema though, since it may only make sense for OpenAI models.
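
For reference, a sketch using the pre-1.0 openai client of that era, showing where the resolved model name comes from (the surrounding logging code is illustrative):

import openai

requested_model = "gpt-3.5-turbo"
response = openai.ChatCompletion.create(
    model=requested_model,
    messages=[{"role": "user", "content": "Say hi"}],
)
resolved_model = response["model"]  # e.g. "gpt-3.5-turbo-0301"
# Record both: the model that was requested and the one that actually ran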

Better ways of storing and accessing API keys

I'm not sure this is actually an issue, as I've developed a workaround, but I thought it was worth bringing up for discussion.

I prefer to keep keys like this in my password manager. Among other things, it allows secure access and consistent sync across machines. I already had a function to access the key, but I don't want to call it on every new shell session as it pops up a prompt in my password manager. I'd prefer to only do that when using the tool, and only the first time in the shell session.

So, I wrote a wrapper function to do that:

llm() {
  # Only fetch the key from the password manager once per shell session
  if [ -z "$OPENAI_API_KEY" ]; then
    export OPENAI_API_KEY="$(open_ai_key)"
  fi
  command llm "$@"
}

I use zsh on macOS.

I'm not aware of other patterns that llm could use to look for a key, and you already provide two reasonable ones... but if there were a third way that would obviate the need for my little wrapper, that would be cool! Otherwise, maybe someone else finds this helpful.

core.LLM class exposing most of the functionality as a Python API

Idea came from here:

Or... maybe it takes an llm argument which is similar to datasette in that it's an object offering a documented API for various useful things, like looking up configuration settings and loading templates and suchlike.

@hookspec
def register_models(llm):
    """Return a list of Models"""

Originally posted by @simonw in #53 (comment)

Made me realize that this tool could work like sqlite-utils in that most features could be available both as Python API methods and as CLI commands.
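
A hypothetical sketch of what that dual interface could look like from Python; these names and signatures are illustrative, not an existing API:

import llm

model = llm.get_model("gpt-3.5-turbo")   # hypothetical lookup, mirroring -m on the CLI
response = model.prompt(
    "Five cute names for a pet penguin",
    system="You are concise",            # mirroring -s/--system
)
print(response.text())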

Mechanism for storing prompt templates

This will allow users to store templates for complex prompts (both as system prompts and regular prompts that have strings interpolated into them) so they can use them in the future.

llm templates edit summary
# An editor opens to edit that prompt
llm --template summary "$(curl -s https://www.example.com/)"

With a shortcut so llm -t summary works too.

Bug: llm templates list doesn't skip newlines

If any templates have newlines the output gets very confusing:

% llm templates list
bad       : this is bad
joke      : Tell a really funny and short joke, surprise me
long      : This is a really long prompt. It's long long long. This is a really long prompt. It's long long long. This is a really long prompt. It's long long long. This is a really long prompt....
recipe    : Suggest a recipe using ingredients: $ingredients

It should be based on cuisine from this country: $country
roast     : 
steampunk : Summarize the following text.
Insert frequent satirical steampunk-themed illustrative anecdotes. Really go wild with that.
Text to summarize: $input

summarize : 
summary   : Summarize this: $input
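
One possible fix, sketched in Python: collapse whitespace runs before printing each template, then truncate (the helper name and width are illustrative):

import re

def one_line(text, width=120):
    # Collapse newlines and repeated whitespace, then truncate for display
    flattened = re.sub(r"\s+", " ", text).strip()
    return flattened if len(flattened) <= width else flattened[: width - 3] + "..."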

Initial design

Initially this tool will let you run things against ChatGPT and GPT-4 from the command-line.

Over time I want to introduce Pluggy plugins to allow you to hook it up to all sorts of other language models, including ones that run locally.

But for starters it will do this:

llm "Prompt goes here"

And a streaming variant:

llm "Ten ideas for cheesecakes" -s

Plus use -4 to run against GPT-4, or --model X to specify another model.

Improvements to logs command: show rowid, support --truncate

Refs:

In order to find the chat ID that I should use to continue a conversation, llm logs needs to include the rowid in the output.

Since prompt responses can be really long, it would be useful to provide an optional -t/--truncate option for truncating them.

Drop platformdirs for click.get_app_dir()

I just noticed Click already has functionality for app directories, which I can use instead of the extra platformdirs dependency: https://click.palletsprojects.com/en/8.1.x/utils/#finding-application-folders

cfg = os.path.join(click.get_app_dir(APP_NAME), 'config.ini')

It looks like I can pass it io.datasette.llm:

>>> import click
>>> click.get_app_dir("hello")
'/Users/simon/Library/Application Support/hello'
>>> click.get_app_dir("io.datasette.llm")
'/Users/simon/Library/Application Support/io.datasette.llm'
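
So the database path could be derived like this (placing log.db directly inside that directory is an assumption):

import os
import click

log_db_path = os.path.join(click.get_app_dir("io.datasette.llm"), "log.db")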
